Abstract. We provide the mathematical framework that elucidates how a Tully-Fisher (TF) like relation is used in the determination of the Hubble constant $H_0$, as well as for the distances of galaxies. The first step toward understanding this problem is to define a statistical model which accounts for the observed (linear) correlation between the absolute magnitude $M$ and the line width distance estimator $p$ of galaxies. Herein, we assume that $M = a.p + b - \zeta$, where $\zeta$ is a random variable of zero mean describing an intrinsic scatter, regardless of measurement errors. The second step is to understand that the calibration of this law is not unique, since it depends on the statistical model used for describing the distribution of the variables involved in the calculations. With this in mind, the methods related to the so-called Direct and Inverse TF Relations (herein DTF and ITF) are interpreted as maximum likelihood statistics. We show that, as long as the same model is used for the calibration of the TF relation and for the determination of $H_0$, we obtain a coherent Hubble constant. In other words, the $H_0$ estimates are not model dependent, while the TF relation coefficients are. The choice of the model is motivated by the robustness of the statistics; it depends on the observational selection effects which are present in the sample. For example, if $p$-selection effects are absent then it is more convenient to use a (newly defined) robust statistic, herein denoted by ITF*. This statistic does not require hypotheses on the luminosity distribution function and on the spatial distribution of sources, and it is still valid when the sample is not complete. Similarly, the above general results also apply to distance estimates of galaxies. The difference between the distance estimates obtained with either the ITF or the DTF model is only due to random fluctuations. It is interesting to point out that the DTF estimate does not depend on the luminosity distribution of sources. Both statistics show a correction for a bias, inadequately believed to be of Malmquist type. The repercussion of measurement errors and additional selection effects are also analyzed.

1. Introduction

Herein, we regard the Tully-Fisher (TF) relation for spiral galaxies in the optical, Tully & Fisher (1977), and the Faber-Jackson relation for E galaxies, Faber & Jackson (1976), as a single law providing us with an estimate of the absolute magnitude $M = a.p + b$, where $p$ is called line width distance estimator. The determination of the Hubble constant $H_0$, when a line width distance estimator is involved, has long been discussed with respect to the Malmquist bias by different authors without yet reaching a general agreement (see e.g., Bottinelli et al.; Giraud; Gouguenheim et al.; Jacoby et al.; Lynden-Bell et al.; Sandage 1988a, 1988b; Teerikorpi; Tully). The aim of this work is to shed light on this problem from a theoretical point of view, and to provide rigorous formulas for ongoing applications. In this first paper, we seek a mathematical framework which gives fair rudiments for discussing the use of the Direct Tully-Fisher (DTF) relation and the Inverse Tully-Fisher (ITF) relation, both of which are interpreted as a choice of a fitting technique.

According to Bigot & Triay (1990b), one must keep in mind that a fitting technique is intimately related to a statistical model. Namely, the related statistics (or estimates) are warranted as long as the distribution of the values of the variables involved in the calculation is correctly described by such a model. Hence, a statistical model is as a matter of fact required for arguing on the use of either the DTF relation or the ITF relation. However, it must be noted that if a statistical model is available then nothing prevents us from using solely the maximum likelihood (ml) technique. Such an approach has the advantage of providing us unambiguously with a unique fitting technique. Moreover, the related statistics give unbiased estimates of model parameters, as long as the statistical model takes into account the selection effects.

Therefore, according to the above precepts, in Section 2, we define the statistical model which describes the distribution of variables involved in the determination of $H_0$. In Section 3, we derive the statistics used for the calibration of the $M$-$p$ relation and the estimation of $H_0$. The influence of selection effects on these estimates is also analyzed. In a subsequent section, we investigate the repercussion of measurement errors, and a further section clarifies the definition of a reliable distance estimate of galaxies. It is strongly recommended to read the notations and useful formulas given in Appendix A; these features are referred to throughout the text by means of the symbol "Def.".

2. The basic model 

Herein, we specify the probability density (pd) describing the 
distribution of variables involved in the calculation. These vari- 
ables are related to intrinsic quantities of sources (galaxies), 
which are : 

— the absolute magnitude M, 

— the line width distance estimator p, 

— the distance r from the observer, 

— and the radial velocity v (corrected for solar motion). 

For reasons of simplicity in calculations, instead of using the variables $r$ and $v$, it is more convenient to use the distance modulus

$$\mu = 25 + 5\log r, \tag{1}$$

where $r$ is given in Mpc, and a similarly defined quantity

$$\eta = 25 + 5\log v. \tag{2}$$
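As a small numerical illustration of Eqs. (1) and (2), the sketch below evaluates both moduli. The function names are illustrative, the velocity is assumed to be expressed in km/s, and the final check assumes a pure Hubble flow $v = H_0 r$ without peculiar velocities, in which case the difference of the two moduli reduces to $5\log H_0$.

```python
import math

def distance_modulus(r_mpc: float) -> float:
    """Distance modulus mu = 25 + 5 log10(r), with r in Mpc (Eq. 1)."""
    return 25.0 + 5.0 * math.log10(r_mpc)

def velocity_modulus(v: float) -> float:
    """Velocity modulus eta = 25 + 5 log10(v), same form as Eq. 1 (Eq. 2)."""
    return 25.0 + 5.0 * math.log10(v)

# Assumed pure Hubble flow v = H0 * r (illustrative values only).
H0 = 100.0   # km/s/Mpc
r = 10.0     # Mpc
offset = velocity_modulus(H0 * r) - distance_modulus(r)  # equals 5 log10(H0)
```

Since the offset does not depend on $r$, every galaxy of a sample carries the same shift between its two moduli under this assumption.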

If the peculiar velocities of sources are neglected then the Hubble law reads

$$\eta = \mu + \mathcal{H}, \tag{3}$$

where $\mathcal{H} = 5\log H_0$. This equation shows that the variables $\mu$ and $\eta$ define the same quantity [1]. If no evolutionary effect of sources is present then the distributions of the intrinsic quantities $M$ and $p$ are independent of $\mu$ (since a distance corresponds to a time-shift). Thus, regardless of capacities in observation, the theoretical pd describing the distribution of the above variables can be written as follows

$$dP_{\rm th} = F(M,p)\,dM\,dp\;\kappa(\mu)\,d\mu, \tag{4}$$

see (Def.1), where $\kappa(\mu)$ accounts for the distribution of galaxies in space and $F(M,p)$ for the $M$-$p$ distribution, i.e., the TF diagram. The projection of the TF pd onto the $M$-axis, resp., the $p$-axis, provides us with the luminosity distribution function,

$$f_M(M; M_0, \sigma_M) = \int F(M,p)\,dp, \tag{5}$$

resp., a pd function (pdf) describing the distribution of $p$ values,

$$f_p(p; p_0, \sigma_p) = \int F(M,p)\,dM, \tag{6}$$

see (Def.1).

[1] This rough description suffices for the present scope, although this model is refined in Triay et al. (1993c).

It is obvious that this statistical model, as defined by Eq. (4), must be improved to take into account observational and selection effects (e.g., the sampling rules). The difficulty in detecting faint galaxies necessarily involves the apparent magnitude

$$m = M + \mu. \tag{7}$$

If the selection effects depend solely on the apparent magnitude, then the data distribution is defined by the following pd

$$dP_{\rm obs} = \frac{\phi_m(m)}{P_{\rm th}(\phi_m)}\,dP_{\rm th}, \tag{8}$$

where $\phi_m(m)$ is called selection function [2], and $P_{\rm th}(\phi_m)$ is a normalization factor, see (Def.3).

[2] It is a positive function which works as a filter response ($0 \le \phi_m(m) \le 1$) with respect to the apparent magnitude.

2.1. Working hypotheses

In order to complete the statistical model, as defined by Eq. (8), we must specify the functions $\phi_m(m)$, $\kappa(\mu)$ and $F(M,p)$. It turns out that some results can nevertheless be obtained without a full description of the model, which provides an interesting feature for the related statistics : robustness.

Since the present scope of our analysis is to shed light on the problem of biases, we limit ourselves to formulating simple but sufficiently complete hypotheses. This has the advantage of avoiding cumbersome calculations while providing us with fair rudiments common to real situations. A more realistic model is given in Triay et al. (1993a). In the following, for the sake of clarity, we always specify the working hypotheses used for each step. In general, the standard working hypotheses assume :

• h1) A magnitude limited complete sample. This property means that the selection effect reduces to a cutoff at a given limiting magnitude, herein designated by $m_{\rm lim}$. Thus, the selection function reads

$$\phi_m(m) = \theta(m_{\rm lim} - m), \tag{9}$$

where $\theta$ is the Heaviside distribution function. It is obvious that

$$m_{\rm lim} \ge \max\{m_k\}, \tag{10}$$

and (in practice) a possible choice is to assume the equality.

• h2) A uniform spatial distribution of sources. The pdf of the distance modulus $\mu$, see Eq. (1), related to a uniform distribution in a Euclidean space is given by

$$\kappa(\mu) \propto \exp(\beta\mu), \quad \text{where} \quad \beta = \frac{3\ln 10}{5}. \tag{11}$$

Let us mention that, while we focus on a uniform spatial distribution of sources, the following calculations and results are still valid for a power law distribution ($\beta \ne 3\ln 10/5$).

In order to specify the TF pdf $F(M,p)$, we must describe accurately the TF diagram, i.e., the relation $p \mapsto \widetilde{M} = \widetilde{M}(p)$. The observations show that the data are distributed about a straight line (which visualizes the TF relation), that we denote $(\Delta_{\rm TF})$. Thus this line is defined by the equation

$$\widetilde{M}(p) = a.p + b. \tag{12}$$

In addition to the scatter from measurement errors, it is sensible to assume that an intrinsic dispersion is also present [3]. From a theoretical point of view, this might be interpreted as either a lack of exact definition of the variable $p$ which accounts for a linear relation, and/or the fact that the physics of galaxies actually requires additional variables for providing the absolute magnitude (i.e., the randomness is interpreted as the effect of these missing variables). Hence, a specific approach for defining $F(M,p)$ would require one to presume a priori the related (unique) physical process, which is unfortunately not yet known [4].

[3] The practical argument in favour of this approach is that presently the TF diagram is still continuously improved by shrewd corrections (such as the galaxy orientation with respect to the line of sight, the linear dimension of the galaxy region from which the line width is measured, etc.), and thus that the $\widetilde{M}$ estimate given in Eq. (12) is approximate.

[4] One guesses that this lack of information is mainly responsible for the present debate on the choice between the DTF and ITF relations, an issue that is clarified below.

It must be noted that the goal is not to fit the data to such a model but to derive the Hubble constant $H_0$. Hence, in order to perform the ml technique, we ask whether we may substitute for $F(M,p)$ a suitable function which imitates it in reproducing the $M$-$p$ correlation. The next section shows that the usual approaches, which consist of using the DTF and the ITF relations, are as a matter of fact interpreted as a particular choice of such a function.

It is advantageous to express a "linear correlation" by using a random variable of zero mean, which accounts for the data dispersion about the straight line $(\Delta_{\rm TF})$. For this purpose, it seems natural to use the "regular" distance, which is given by the segment orthogonal to this line. However, with our theoretical approach, it is equivalent (and suggested by arguments of simplicity) to use the following random variable

$$\zeta = \widetilde{M} - M, \tag{13}$$

where $\widetilde{M}$ is given in Eq. (12), which is proportional to the regular distance (which reads $\zeta\cos(\arctan a)$). According to Eq. (12, 13), obvious calculations show that, regardless of selection effects, the standard deviation $\sigma_\zeta$ of the $\zeta$-distribution verifies

$$\sigma_\zeta^2 = \left(a\sigma_p - \rho_{\rm th}(p,M)\,\sigma_M\right)^2 + \left(1 - \rho_{\rm th}^2(p,M)\right)\sigma_M^2, \tag{14}$$

where $\rho_{\rm th}(p,M)$ denotes the theoretical correlation coefficient, see (Def.5). It is clear that the existence of an efficient correlation is expressed by the following inequality

$$\gamma = \frac{\sigma_\zeta}{\sigma_M} \ll 1, \tag{15}$$

or equivalently by $\sigma_\zeta \ll |a|\,\sigma_p$. Namely, Eq. (15) ensures that $p$ provides a good estimate of the absolute magnitude $M$ from Eq. (12).

In order to proceed with the ml technique, we must presume the form of the $\zeta$-pdf, hereafter denoted by $g(\zeta; 0, \sigma_\zeta)$, see (Def.1), which mimics the data distribution. In practice, a simple regression analysis should help us to guess the candidate form to be used [5]. Now, a second random variable is required for specifying entirely the data distribution, say $\xi$. For a faithful description of the TF diagram, the choice of $\xi$ should be suggested by the appearance of the $M$-$p$ distribution. Thus, in the absence of such information, our working hypotheses for describing the $M$-$p$ correlation are :

• h3) the TF diagram can be mimicked by the pdf

$$f_\xi(\xi)\,d\xi\; g(\zeta; 0, \sigma_\zeta)\,d\zeta \approx F(M,p)\,dM\,dp, \tag{16}$$

with a Gaussian $\zeta$-distribution

$$g(\zeta; 0, \sigma_\zeta) = g_G(\zeta; 0, \sigma_\zeta), \tag{17}$$

see (Def.1.a).

[5] It is obvious that it must not be exponential, in order to ensure an effective $M$-$p$ correlation.

The model defined by the pdf given by Eq. (16) is general enough to interpret the ITF and DTF methods used in the literature. Namely, the ITF relation $p = a_{\rm I}\,M + b_{\rm I}$, and the DTF relation $M = a_{\rm D}\,p + b_{\rm D}$, correspond to the following identifications :

$$\xi = \begin{cases} M & \text{in the ITF model} \\ p & \text{in the DTF model.} \end{cases} \tag{18}$$

Let us mention that such a definition has the advantage of avoiding a confusion which is inherent to the usual approaches, e.g., see Teerikorpi (1990). Indeed, Eq. (16) tells us that $\xi$ and $\zeta$ are uncorrelated outcomes, thus the use of conditional probabilities allows us to estimate a value of $p$ from a value of $M = \xi$, in the case of the ITF model, and we have the reverse situation in the case of the DTF model, see Eq. (12, 13). On the other hand, as long as the related random processes are not specified, the usual definitions would wrongly suggest that $a_{\rm I} = 1/a_{\rm D}$ and $b_{\rm I} = -b_{\rm D}/a_{\rm D}$.

3. The technique of fitting

We proceed as follows : for each model, as defined by Eq. (16-18), we investigate the calibration of the TF relation (Step 1), see Eq. (12), and the determination of $H_0$ (Step 2).

In order to establish the likelihood function, the pdf is written in terms of observables, see (Def.4). The definition of these variables depends on the step of the analysis :

Step 1) For the calibration of the TF relation, the data sample corresponds to the following observables

$$\{\mu_k, p_k, M_k = m_k - \mu_k\}_{k=1,N_1}, \tag{19}$$

see Eq. (7).

Step 2) For the determination of the Hubble constant, it is more convenient to use the following observables

$$x = m - \eta, \tag{20}$$

$$y = a.p + b + \eta - m, \tag{21}$$

see Eq. (12), and the data sample corresponds to

$$\{\eta_k, x_k, y_k\}_{k=1,N_2}. \tag{22}$$

We use the following notations : $N_1$ for the sample size, $\langle\cdot\rangle_1$ for the sample average, ${\rm Cov}_1$ for the sample covariance, etc., in Step 1, and $N_2$, $\langle\cdot\rangle_2$, ${\rm Cov}_2$, etc., in Step 2, see (Def.5), which helps us to disentangle the statistics involved in each step. We must be aware that the estimates of the random variables $y$ and $\zeta$ (thus of $\sigma_\zeta$) are model dependent (through the estimates of $a$ and $b$), while the variables themselves are uniquely defined by Eq. (13, 21).
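The reason for choosing the observables of Eqs. (20) and (21) becomes apparent when they are rewritten with Eqs. (3), (7), (12) and (13). The following lines are a direct substitution, under the assumption that peculiar velocities are neglected as in Eq. (3):

```latex
\begin{align*}
x &= m - \eta = (M + \mu) - (\mu + \mathcal{H}) = M - \mathcal{H}, \\
y &= a.p + b + \eta - m
   = \widetilde{M} + (\mu + \mathcal{H}) - (M + \mu)
   = \zeta + \mathcal{H}.
\end{align*}
```

Hence $x$ carries the luminosity information and $y$ the TF scatter, and both are shifted by the same unknown $\mathcal{H} = 5\log H_0$; the distance modulus $\mu$ cancels in each of them.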



3.1. Regardless of the Tully-Fisher relation

For reasons that appear clear in the following, let us derive the ml estimator of the Hubble constant when the TF relation is ignored. The related statistical model is obtained by integrating the pd given by Eq. (4) over the variable $p$. According to Eq. (5), we obtain a theoretical pd which reads

$$f_M(M; M_0, \sigma_M)\,dM\;\kappa(\mu)\,d\mu. \tag{23}$$

By writing this pd in terms of the observables $x$ and $\eta$, see Eq. (20), we easily understand that, for deriving an $H_0$ estimator, we must specify the pdfs $f_M(M; M_0, \sigma_M)$ and $\kappa(\mu)$. Let us assume :

• h4) a Gaussian luminosity distribution function,

$$f_M(M; M_0, \sigma_M) = g_G(M; M_0, \sigma_M); \tag{24}$$

and hypothesis (h2), i.e., a uniform spatial distribution, see Eq. (11). Hence, the lf is given by a sum over the $N_2$ objects of the sample (Eq. 25), see (Def.4.b). Obvious calculations show that the normalization factor does not depend on $H_0$ as long as the selection effects are free of velocity criteria. Hence, the ml equation provides us with the following statistic

$$\mathcal{H} \approx \mathcal{H}^{0} = M_0 - \beta\,\sigma_M^2 - \langle x\rangle_2, \tag{26}$$

independently of whether the sample is complete up to a limiting magnitude. It is important to mention that, since $\partial P_{\rm th}(\phi_m)/\partial\mathcal{H} = 0$, the term $\beta\sigma_M^2$ in Eq. (26) is not to be interpreted as a Malmquist bias correction [6], see (Def.6). In the case of $\eta$-selection effects, it is easy to show that the estimator (26) transforms by substituting $\beta$ by $\beta + \partial\ln P_{\rm th}(\phi_m\phi_v)/\partial\mathcal{H}$, where $\phi_v$ is the selection function describing the selection effects on velocities. In order to calculate the accuracy of Eq. (26), we need to specify the form of the selection function. For a magnitude limited complete sample, i.e., (h1), see Eq. (9), the $x$-distribution reads $\propto g_G(x; M_0 - \mathcal{H} - \beta\sigma_M^2, \sigma_M)$, which shows that the standard deviation is equal to

$$\sigma_{\mathcal{H}^{0}} = \frac{\sigma_M}{\sqrt{N_2}}. \tag{27}$$

It is obvious that more accuracy is expected when the TF relation is used.

[6] While the sample average of absolute magnitudes in a magnitude limited sample is indeed biased because of the Malmquist effect, it is given by $\langle M\rangle_2 \approx M_0 - \beta\sigma_M^2$ under hypotheses (h1, h2, h4).

3.2. The inverse Tully-Fisher relation

Now we take into account the TF relation, by using the pdf defined by Eq. (16, 17). The ITF model is specified by the choice $\xi = M$, which means that the random variables $M$ and $\zeta$ are not correlated [7]. Namely, the data distribution in the $M$-$p$ plane is supposed to be described by the following pd

$$f_M(M; M_0, \sigma_M)\,dM\; g_G(\zeta; 0, \sigma_\zeta)\,d\zeta \approx F(M,p)\,dM\,dp. \tag{28}$$

[7] Actually, these variables may not be necessarily independent, e.g., the $\zeta$-pdf may be Gaussian with an $M$-dependent standard deviation, $g_G(\zeta; 0, \sigma_\zeta(M))$, while Eq. (5, 12) are still fulfilled.

Let us recall that the precise role of this pd is to mimic the data distribution, without interpreting the physical process involved in the TF diagram. This explains the respective locations of the terms with respect to the equal sign in Eq. (28). The calculations, given in the appendix, provide us with the following results :

Step 1) The ml equations yield statistics of $a \approx a^{\rm ITF}$, $b \approx b^{\rm ITF}$ and $\sigma_\zeta \approx \sigma_\zeta^{\rm ITF}$, which are defined as follows

$$a^{\rm ITF} = \frac{(\Sigma_1(M))^2}{{\rm Cov}_1(p,M)}, \tag{29}$$

$$b^{\rm ITF} = \langle M\rangle_1 - a^{\rm ITF}\langle p\rangle_1, \tag{30}$$

$$\sigma_\zeta^{\rm ITF} = \Sigma_1(M)\sqrt{\frac{1}{\rho_1^2(p,M)} - 1}. \tag{31}$$

It is interesting to note that, regarded as conventional estimators, these statistics show no correction for the Malmquist bias. Let us emphasize that they are valid for any form of the selection function $\phi_m$. Such a feature is of particular interest because a smooth decreasing function describes the selection effects on apparent magnitude more realistically than a sharp cutoff, as assumed by hypothesis (h1). Moreover, for the same reasons, it turns out that these statistics still work whatever the forms of the functions $f_M$ and $\kappa$, i.e., for any type of luminosity and spatial distributions of sources [8]. The (mathematical) reason for such properties is that the normalization factor $P_{\rm th}(\phi)$ does not depend on the model parameters $a$, $b$ and $\sigma_\zeta$, which is the case when the selection function reads $\phi = \phi_m$ or $\phi = \phi_m\phi_v$, see the appendix.

[8] Note that the pdf $\kappa$ might simultaneously describe the selection effects on distance.

Step 2) It turns out that the form of the ITF pdf, as given by Eq. (28) (see also Eq. (B8)), allows us to derive straightforwardly a first estimator, which is given by

$$\mathcal{H}^{\rm ITF*} = \langle y\rangle_2. \tag{32}$$

It provides us with $\mathcal{H}$ within the standard deviation

$$\sigma_{\mathcal{H}^{\rm ITF*}} = \frac{\sigma_\zeta^{\rm ITF}}{\sqrt{N_2}}, \tag{33}$$

see Eq. (31). Let us emphasize that it is obtained while preserving the above advantages, i.e., without assumptions on the completeness of the sample or on the spatial and luminosity distributions.

On the other hand, the derivation of the ml statistic forces us to specify the functions $\kappa$ and $f_M$. Thus, in addition to hypotheses (h2, h3), see Eq. (11, 16), we assume a Gaussian luminosity distribution function (h4), see Eq. (24). Hence, we obtain the following $H_0$ statistic

$$\mathcal{H}^{\rm ITF} = \frac{\mathcal{H}^{\rm ITF*} + \gamma^2\,\mathcal{H}^{0}}{1 + \gamma^2}, \tag{34}$$

where $\gamma = \gamma^{\rm ITF}$, see Eq. (15, 26, 32). The accuracy of such an estimate is calculated by specifying the function $\phi_m$. By assuming (h1), we obtain a standard deviation of

$$\sigma_{\mathcal{H}^{\rm ITF}} = \frac{\sigma_\zeta^{\rm ITF}}{\sqrt{N_2}\,\sqrt{1 + \gamma^2}}. \tag{35}$$

It is obvious that Eq. (34) can be interpreted as a weighted mean of $H_0$ estimators, where the weighting factors correspond to the related accuracies. Equation (35) shows that the ITF estimate is more accurate than the ITF* one. Hence, the advantage of having fewer constraints on the validity domain of the ITF* estimator (i.e., of having weak working hypotheses for defining the ITF* model) comes at the expense of the accuracy of the estimates.

It turns out that both estimators $\mathcal{H}^{\rm ITF*}$ and $\mathcal{H}^{\rm ITF}$ can be used even when the sources are selected upon velocity criteria. The reason is that the related selection function $\phi_m(x + \eta)\,\phi_v(\eta)$ does not disturb the independence of $y$ with respect to the variables $\eta$ and $x$, see Eq. (B8). On the other hand, these statistics become ineffective when selection rules are based on $p$, because the normalization factor $P_{\rm th}$ then depends as a matter of fact on $a$ and $b$.

It is interesting to note that a little algebra allows us to write Eq. (34) as

$$\mathcal{H}^{\rm ITF} = \frac{\left(\langle y\rangle_2 - \beta\,(\sigma_\zeta^{\rm ITF})^2\right) + \gamma^2\left(M_0 - \langle x\rangle_2\right)}{1 + \gamma^2}. \tag{36}$$

While providing the same quantity as Eq. (34), Eq. (36) is a different weighted mean of two quantities which are not $H_0$ estimators. The interpretation of this formula is clarified in a later section.
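The weighted-mean reading of Eq. (34) can be made explicit. The sketch below assumes that the two estimators $\mathcal{H}^{0}$ (Eq. 26) and $\mathcal{H}^{\rm ITF*}$ (Eq. 32) are independent, with variances $\sigma_M^2/N_2$ and $\sigma_\zeta^2/N_2$, and combines them with inverse-variance weights; the weights $w_0$ and $w_*$ are notation introduced here for illustration.

```latex
\begin{align*}
w_{*} &= \frac{N_2}{\sigma_\zeta^{2}}, \qquad
w_{0} = \frac{N_2}{\sigma_M^{2}}, \qquad
\gamma = \frac{\sigma_\zeta}{\sigma_M}, \\
\frac{w_{*}\,\mathcal{H}^{\rm ITF*} + w_{0}\,\mathcal{H}^{0}}{w_{*} + w_{0}}
 &= \frac{\mathcal{H}^{\rm ITF*} + \gamma^{2}\,\mathcal{H}^{0}}{1 + \gamma^{2}}, \\
\frac{1}{w_{*} + w_{0}}
 &= \frac{\sigma_\zeta^{2}}{N_2\,(1 + \gamma^{2})} .
\end{align*}
```

The last line is the variance of the combined estimate under these assumptions; it is smaller than $\sigma_\zeta^2/N_2$ by the factor $1 + \gamma^2$, in line with the statement that the ITF estimate is more accurate than the ITF* one.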

3.3. The (direct) TF relation

The underlying working hypothesis used in the DTF approach is that the random variables $p$ and $\zeta$ are not correlated. Namely, the data distribution in the $M$-$p$ plane is supposed to be described by the following pd

$$f_p(p; p_0, \sigma_p)\,dp\; g_G(\zeta; 0, \sigma_\zeta)\,d\zeta \approx F(M,p)\,dM\,dp. \tag{37}$$

It turns out that this approach forces us to presume a priori the form of the functions $\phi_m(m)$, $f_p(p; p_0, \sigma_p)$ and $\kappa(\mu)$. Hence, we use (h1, h2, h3), see Eq. (9, 11, 16), and for reasons of coherence with h4, we assume :

• h5) a Gaussian $p$-distribution,

$$f_p(p; p_0, \sigma_p) = g_G(p; p_0, \sigma_p). \tag{38}$$

The calculations, which are given in the appendix, provide us with the following results :

Step 1) The likelihood equations yield statistics of $a \approx a^{\rm DTF}$, $b \approx b^{\rm DTF}$ and $\sigma_\zeta \approx \sigma_\zeta^{\rm DTF}$, which are defined as follows

$$a^{\rm DTF} = \frac{{\rm Cov}_1(p,M)}{(\Sigma_1(p))^2}, \tag{39}$$

$$b^{\rm DTF} = \langle M\rangle_1 - a^{\rm DTF}\langle p\rangle_1 + \beta\,(\sigma_\zeta^{\rm DTF})^2, \tag{40}$$

$$\sigma_\zeta^{\rm DTF} = \Sigma_1(M)\sqrt{1 - \rho_1^2(p,M)}. \tag{41}$$

It is interesting to note that, regarded as conventional statistics, only the estimator of $b$, Eq. (40), shows a correction for a bias.

Step 2) The ml statistic of $H_0$ is given by

$$\mathcal{H}^{\rm DTF} = \langle y\rangle_2 - \beta\,(\sigma_\zeta^{\rm DTF})^2, \tag{42}$$

and has a standard deviation equal to

$$\sigma_{\mathcal{H}^{\rm DTF}} = \frac{\sigma_\zeta^{\rm DTF}}{\sqrt{N_2}}, \tag{43}$$

see Eq. (41).

The above statistics are no longer valid when selection effects on $p$, as well as on $\eta$, are present. Nevertheless, they can easily be adapted by rewriting the $p$-pdf as $f_p(p) \propto \phi_p(p)\,f_p(p; p_0, \sigma_p)$, to take into account $p$-selection effects. Let us emphasize that the correction in Eq. (42) is not of Malmquist type, in contrast with the one in Eq. (40). Moreover, it must be noted that the magnitude of the bias does not depend on the limiting magnitude $m_{\rm lim}$, nor on $\sigma_p$ (or equivalently $\sigma_M$).
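The two calibrations can be compared numerically. The sketch below is only an illustration: it assumes the moment expressions $a^{\rm ITF} = \Sigma_1(M)^2/{\rm Cov}_1(p,M)$ and $a^{\rm DTF} = {\rm Cov}_1(p,M)/\Sigma_1(p)^2$, with the corresponding $b$ and $\sigma_\zeta$ statistics, evaluated with $1/N$ moments; the function names and the default value of `beta` (uniform spatial distribution) are choices made here.

```python
import math

BETA = 3.0 * math.log(10.0) / 5.0  # uniform spatial distribution (assumed)

def moments(p, M):
    """Return mean(p), mean(M), var(p), var(M), cov(p, M) with 1/N weights."""
    n = len(p)
    mp, mM = sum(p) / n, sum(M) / n
    var_p = sum((x - mp) ** 2 for x in p) / n
    var_M = sum((y - mM) ** 2 for y in M) / n
    cov = sum((x - mp) * (y - mM) for x, y in zip(p, M)) / n
    return mp, mM, var_p, var_M, cov

def calibrate_itf(p, M):
    """ITF calibration: the scatter zeta is taken as uncorrelated with M."""
    mp, mM, var_p, var_M, cov = moments(p, M)
    rho2 = cov ** 2 / (var_p * var_M)
    a = var_M / cov
    b = mM - a * mp
    sigma = math.sqrt(var_M * (1.0 / rho2 - 1.0))
    return a, b, sigma

def calibrate_dtf(p, M, beta=BETA):
    """DTF calibration: the scatter zeta is taken as uncorrelated with p."""
    mp, mM, var_p, var_M, cov = moments(p, M)
    rho2 = cov ** 2 / (var_p * var_M)
    a = cov / var_p
    sigma = math.sqrt(var_M * (1.0 - rho2))
    b = mM - a * mp + beta * sigma ** 2   # bias term proportional to beta
    return a, b, sigma
```

For any sample with $0 < |\rho_1| < 1$, the ratio of the two slopes returned by these functions equals $\rho_1^2$ and the ratio of the two dispersions equals $|\rho_1|$, so the coefficients differ even though they are computed from the same data.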

3.4. Comparison of estimators

It is important to understand that the Hubble constant $H_0$ has a similar status among these models, contrary to the parameters $a$ and $b$, which define the data distribution on the TF diagram. Indeed, it must be noted that the ITF model inherits all model parameters defined in the ITF* model, simply because it is a particular case, where the functions $\phi_m(m)$, $\kappa(\mu)$ and $f_M(M)$ are specified. Hence, it is clear that $H_0$ keeps an identical status. On the other hand, because the DTF model and the ITF models describe the data distribution in a different way (see the appendix), we might expect to obtain a different status. However, let us note that the pd given in Eq. (23) is the projection of the ITF* pd, as well as of the DTF one. Therefore, the model parameters which are in common (i.e., which are not cancelled by the projection) are identical among these models, which is the case of $H_0$.

Therefore, according to the previous sections, which show that Eq. (26, 32, 34, 42) define unbiased $H_0$ statistics, the related estimates (for a given sample) correspond as a matter of fact to the same quantity, and the discrepancies (between these different estimates) are interpreted as statistical fluctuations which should vanish when the sample size increases. With this in mind, we investigate the nature of such discrepancies. These quantities can be derived from the following ones

$$\Delta_{\rm II*} = \mathcal{H}^{\rm ITF} - \mathcal{H}^{\rm ITF*}, \tag{44}$$

$$\Delta_{\rm 0I*} = \mathcal{H}^{0} - \mathcal{H}^{\rm ITF*}, \tag{45}$$

$$\Delta_{\rm DI*} = \mathcal{H}^{\rm DTF} - \mathcal{H}^{\rm ITF*}, \tag{46}$$

where the ITF* is chosen as a reference estimate. According to Eq. (34, 44), the difference between the ITF and the ITF* statistics reads

$$\Delta_{\rm II*} = \frac{\gamma^2}{1 + \gamma^2}\,\Delta_{\rm 0I*}, \tag{47}$$

where $\gamma = \gamma^{\rm ITF}$, see Eq. (15). Thus, the smaller the ratio $\gamma^{\rm ITF}$, the smaller the discrepancy $\Delta_{\rm II*}$. Let us elucidate this particular property, which shows clearly the gain of knowledge on $H_0$ when the TF relation is used. It is important to note that the estimates $\mathcal{H}^{0}$ and $\mathcal{H}^{\rm ITF*}$, given in Eq. (26, 32), are independent, i.e., they involve two different types of information. Indeed, $\mathcal{H}^{0}$ is based only on characteristics related to the luminosity distribution function of sources (or equivalently, on the $p$-distribution), while $\mathcal{H}^{\rm ITF*}$ takes into account only the TF relation (i.e., the $\zeta$-distribution). When both features are used, we obtain a more accurate estimate $\mathcal{H}^{\rm ITF}$, which lies between $\mathcal{H}^{0}$ and $\mathcal{H}^{\rm ITF*}$. Actually, it lies much closer to $\mathcal{H}^{\rm ITF*}$, according to Eq. (15), which shows that this estimator is less sensitive to hypotheses on the luminosity distribution function of sources, which reflects the robustness of the estimator $\mathcal{H}^{\rm ITF*}$.

In order to calculate $\Delta_{\rm DI*}$, let us compare the statistics (29-31) and (39-41). After a little algebra, we obtain

$$a^{\rm DTF} = \rho_1^2(p,M) \times a^{\rm ITF}, \tag{48}$$

$$b^{\rm DTF} = b^{\rm ITF} + \left(1 - \rho_1^2(p,M)\right)\left(a^{\rm ITF}\langle p\rangle_1 + \beta\,(\Sigma_1(M))^2\right), \tag{49}$$

$$\sigma_\zeta^{\rm DTF} = |\rho_1(p,M)|\;\sigma_\zeta^{\rm ITF}. \tag{50}$$

Hence, according to Eq. (32, 42, 46), it follows [9]

$$\Delta_{\rm DI*} = \frac{\rho_1}{|\rho_1|}\,\frac{\gamma}{\sqrt{1 + \gamma^2}}\;\sigma_\zeta^{\rm ITF}\,C_p, \tag{51}$$

where $\gamma = \gamma^{\rm ITF}$, see Eq. (15), $\rho_1 = \rho_1(p,M)$ and

$$C_p = \frac{\langle p\rangle_1 - \langle p\rangle_2}{\Sigma_1(p)}. \tag{52}$$

[9] Let us recall that the average $\langle y\rangle_2$ is a model-dependent quantity.

Thus $\Delta_{\rm DI*}$ depends essentially on two independent characteristics. The first one is the discrepancy of the sample averages of $p$ values between the calibration sample and the one used to determine $H_0$. The second one is the accuracy of the TF relation and, similarly as above, Eq. (51) shows that the smaller the ratio $\gamma^{\rm ITF}$ the smaller the discrepancy. Let us emphasize that such a feature can also be interpreted in terms of the $M$-$p$ correlation, since we have $\gamma^2/(1 + \gamma^2) \approx 1 - \rho_1(p,M)^2$, see Eq. (15, 31). Thus, the higher the value of $|\rho_1(p,M)|$ the smaller the discrepancies given in Eq. (47, 51). The hypothetical case $|\rho_1(p,M)| = 1$ is a singular situation, where the data distribution on the TF diagram coincides with the straight line $(\Delta_{\rm TF})$ defined by Eq. (12), which makes the ITF*, the ITF and the DTF approaches identical.

Now let us calculate the expected orders of magnitude of the discrepancies given in Eq. (44-46), and their dependence on the sample size. Note that the statistics defined in Eq. (26, 32), considered as random variables (see Sec. B.2), are independent. Hence, $\Delta_{\rm 0I*}$, resp. $\Delta_{\rm II*}$, both have a vanishing expected value, with a standard deviation

$$\sigma_{\Delta_{\rm 0I*}} = \sqrt{1 + \gamma^2}\;\frac{\sigma_M}{\sqrt{N_2}}, \tag{53}$$

resp.

$$\sigma_{\Delta_{\rm II*}} = \frac{\gamma^2}{\sqrt{1 + \gamma^2}}\;\frac{\sigma_M}{\sqrt{N_2}}. \tag{54}$$

Therefore, the difference between the ITF and the ITF* estimates is not systematic, and thus there is no bias. Moreover these estimates coincide as the sample size $N_2$ increases. Similarly, since $\langle p\rangle_1$, resp. $\langle p\rangle_2$, is a statistic providing the mean $p_0$ with a standard deviation of $\sigma_p/\sqrt{N_1}$, resp. $\sigma_p/\sqrt{N_2}$, and since Step 1 and Step 2 are independent, $C_p$ is a random variable of vanishing mean with standard deviation $\approx \sqrt{1/N_1 + 1/N_2}$. Hence, $\Delta_{\rm DI*}$ has a vanishing expected value, with a standard deviation

$$\sigma_{\Delta_{\rm DI*}} \approx \frac{\gamma}{\sqrt{1 + \gamma^2}}\;\sigma_\zeta^{\rm ITF}\,\sqrt{1/N_1 + 1/N_2}. \tag{55}$$

Thus there is no bias between the DTF and the ITF* estimates, while the estimates of the model parameters $a$ and $b$ are different, see Eq. (48-50). Nevertheless, it must be noted that these estimates coincide only when both sample sizes $N_1$ and $N_2$ increase, which emphasizes the importance of the calibration of the TF relation.
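The way the difference between the DTF and the ITF* estimates reduces to a mismatch of mean $p$ values can be retraced explicitly. The lines below are a sketch, assuming $\Delta_{\rm DI*} = \mathcal{H}^{\rm DTF} - \mathcal{H}^{\rm ITF*}$, that each estimator uses its own calibration in $y = a.p + b + \eta - m$, and the moment relations $a^{\rm DTF} = \rho_1^2\,a^{\rm ITF}$ and $(\sigma_\zeta^{\rm DTF})^2 = (1 - \rho_1^2)\,(\Sigma_1(M))^2$; $\rho_1$ stands for $\rho_1(p,M)$.

```latex
\begin{align*}
\Delta_{\rm DI*}
 &= \left(a^{\rm DTF} - a^{\rm ITF}\right)\langle p\rangle_2
  + \left(b^{\rm DTF} - b^{\rm ITF}\right)
  - \beta\,(\sigma_\zeta^{\rm DTF})^{2} \\
 &= -(1 - \rho_1^{2})\,a^{\rm ITF}\langle p\rangle_2
  + (1 - \rho_1^{2})\left(a^{\rm ITF}\langle p\rangle_1
  + \beta\,(\Sigma_1(M))^{2}\right)
  - \beta\,(1 - \rho_1^{2})\,(\Sigma_1(M))^{2} \\
 &= (1 - \rho_1^{2})\,a^{\rm ITF}
    \left(\langle p\rangle_1 - \langle p\rangle_2\right) .
\end{align*}
```

Under these assumptions the $\beta$ terms cancel exactly, so that only the difference of the sample averages of $p$ between the calibration sample and the $H_0$ sample remains, multiplied by a factor that vanishes when $|\rho_1| = 1$.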

Let us recall that the ITF and the DTF models belong to a single class of models defined by Eq. (16). Intermediate models can be obtained by means of a rotation parameter which makes a link between the ITF and the DTF models. Thus, simple arguments (of linearity) indicate that the $H_0$ statistics related to these models provide asymptotically identical estimates, Triay et al. (1993b). More generally, we might ask whether this is still true for models describing the TF diagram in a more complex way. An element of answer comes by noting that we have $y = \zeta + \mathcal{H}$, see Eq. (12, 13, 21), and that $P_{\rm th}(y) = \mathcal{H}$ independently of hypothesis (h3), which forces the random variable $\zeta$ to have a vanishing mean value. Thus, we can claim that if this condition is fulfilled then the way of describing the data does not influence the determination of $H_0$.

On account of these results, let us ask whether the accuracy of estimates may be used as a criterion for preferring a particular model. According to Eq. (15, 33, 35, 43, 50), it turns out that we obtain

$$\sigma_{\mathcal{H}^{\rm DTF}} \approx \sigma_{\mathcal{H}^{\rm ITF}}, \tag{56}$$

while the ITF* estimate is less accurate, see Eq. (35). Thus, we have similar precisions in estimating $H_0$ by using either the DTF or the ITF approach. Work is in progress for checking whether an intermediate statistical model which describes the TF relation might provide higher accuracy on the determination of $H_0$, Triay et al. (1993b).

According to Eq. (51), it is interesting to note that if the following equality

$$C_p = 0, \tag{57}$$

herein called the "$C_p$-criterion", is fulfilled for a given sample, then the algebraic expressions of the ITF* and the DTF estimators of $H_0$ become identical. On the other hand, if the "$C_p$-criterion" is not verified then nothing prevents us from resampling (removing data according to rules allowed by the working hypotheses) for this purpose (although it reduces the sample size, and thus diminishes the information). Namely, the faintest objects can be removed until Eq. (57) is fulfilled, which preserves the selection rule based on the magnitude selection effect (i.e., the hypothesis h1), but with a brighter limiting magnitude.
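One possible reading of this resampling is given here only as a sketch: sort the $H_0$ sample by apparent magnitude, consider every brighter limiting magnitude in turn, and keep the cut for which $|C_p|$ is smallest. The function name, the `min_size` guard and the choice of trimming the second sample rather than the calibration sample are assumptions made for illustration; the text does not prescribe a specific algorithm.

```python
def trim_to_cp(p_cal, sample, min_size=10):
    """Brighten the magnitude limit of `sample` so that |C_p| is minimal.

    p_cal  : p values of the calibration sample (Step 1)
    sample : list of (m, p) pairs of the H0 sample (Step 2)
    Returns (trimmed_sample, m_lim, c_p).
    """
    n1 = len(p_cal)
    mean1 = sum(p_cal) / n1
    sigma1 = (sum((x - mean1) ** 2 for x in p_cal) / n1) ** 0.5

    ordered = sorted(sample)          # brightest objects (smallest m) first
    best = None
    running = 0.0
    for k, (m, p) in enumerate(ordered, start=1):
        running += p                  # sum of p over the k brightest objects
        if k < min_size:
            continue                  # do not shrink the sample below min_size
        c_p = (mean1 - running / k) / sigma1   # C_p for this candidate cut
        if best is None or abs(c_p) < abs(best[2]):
            best = (k, m, c_p)
    if best is None:
        raise ValueError("sample smaller than min_size")
    k, m_lim, c_p = best
    return ordered[:k], m_lim, c_p
```

Because the cut is always applied at an apparent magnitude, the trimmed sample remains magnitude limited, which is the property the resampling is meant to preserve; the price is a smaller sample and therefore a larger statistical error.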

3.5. Applications 

In order to have a visual support for our theoretical approach
and to investigate the influence of calibration errors in the
determination of $H_0$, we perform $N_s = 1000$ simulations. We
generate two sorts of samples: the $\{m_k, \mu_k, p_k\}_{k=1,N_1}$, which
is involved in the calibration step, and the $\{m_k, \eta_k, p_k\}_{k=1,N_2}$,
which is involved in the determination of the Hubble constant
$H_0$; the latter contains $N_2 = 100$ objects. For each sample, we
have three independent data sets: the absolute magnitudes
$\{M_k\}$ and the intrinsic TF dispersions $\{\zeta_k\}$, according to working
hypotheses ($h_3$, $h_4$), see Eq. ( [lq , p4J ), and the distance moduli
$\{\mu_k\}$, according to working hypotheses ($h_1$, $h_2$), see Eq. (11);
the $\{p_k\}$ are derived from Eq. (12, 13). The apparent magnitudes
$\{m_k\}$ are given by Eq. (7), and the cosmological velocity
moduli $\{\eta_k\}$ are calculated according to Eq. (H), with a Hubble
constant given by

$$H_0 = 100\ \mathrm{km\,s^{-1}\,Mpc^{-1}}. \qquad (58)$$

The characteristics of samples are the following:

- completeness of samples, up to a limiting magnitude of
  $m_{\lim} = 12$; (59)

- a uniform spatial distribution of sources;

- a Gaussian luminosity distribution function, defined by
  $M_0 = -19$, $\sigma_M = 1.5$; (60)

- we assume (a priori) the ITF model, so that the TF diagram
  shows a normal $\zeta$-dispersion at constant $M$ defined by
  $\sigma_\zeta = 0.5$, (61)
  and with the following calibration parameters
  $a = -6$ and $b = -7$. (62)
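The sampling recipe above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: it assumes that a uniform spatial distribution corresponds to a distance-modulus density proportional to $\exp(\beta\mu)$ with $\beta = 3\ln 10/5$, that the velocity modulus is $\eta = \mu + 5\log_{10} H_0$, and it uses NumPy; the function name `simulate_sample` and its arguments are hypothetical.

```python
import numpy as np

# Assumption: uniform space density gives a density ∝ exp(BETA * mu) in distance modulus.
BETA = 3.0 * np.log(10.0) / 5.0


def simulate_sample(n, rng, a=-6.0, b=-7.0, sigma_zeta=0.5,
                    m0=-19.0, sigma_m=1.5, m_lim=12.0, h0=100.0):
    """Draw n objects of a complete, magnitude-limited sample under the ITF model."""
    # In a sample complete up to m_lim, the selected absolute magnitudes stay
    # Gaussian with the same width but a mean shifted by -BETA * sigma_m**2.
    M = rng.normal(m0 - BETA * sigma_m**2, sigma_m, n)
    # Given M, mu <= m_lim - M with density ∝ exp(BETA * mu): inverse-CDF draw.
    mu = m_lim - M + np.log(1.0 - rng.random(n)) / BETA
    # Intrinsic TF scatter at constant M, and p from M = a*p + b - zeta.
    zeta = rng.normal(0.0, sigma_zeta, n)
    p = (M - b + zeta) / a
    m = M + mu                          # apparent magnitude
    eta = mu + 5.0 * np.log10(h0)       # assumed definition of the velocity modulus
    return {"M": M, "mu": mu, "p": p, "m": m, "eta": eta}


sample = simulate_sample(8000, np.random.default_rng(0))
print(sample["m"].max(), sample["M"].mean(), sample["p"].mean())
```

Drawing $M$ first and then $\mu$ conditionally avoids rejection sampling, which would otherwise discard most trial objects at a limiting magnitude of 12.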

A first set of simulations is performed in order to investigate
the $H_0$ statistics regardless of calibration errors. In this
case, a unique calibration sample is used; the sample size of
$N_1 = 8\,000$ galaxies is large enough so that the estimates
of model parameters $a$, $b$ and $\sigma_\zeta$, as given by Eq. ( |29| - |29[|39| -
bS), are expected to be free of statistical fluctuations. The
results are given in Table 1. It is reassuring to note that the ITF
estimates of model parameters $a$, $b$ and $\sigma_\zeta$ do correspond
to the values used to generate the random samples,
see Eq. (62), while this is not the case for the DTF estimates.
These estimates are used to determine $H_0$, according
to Eq. (||,|^J42l), on $N_s = 1000$ samples of $N_2 = 100$ objects.
The average of these estimates and their related accuracy (1σ)
are given in Table 1. In agreement with the theory, we
note that these statistics give back the value $H_0$ used for the
simulations, which shows that they are not biased. Moreover,
the related accuracies are in agreement with
the expected values obtained from Eq. d33,pa,[43|) , where we use
$\sigma_{\mathcal{H}} \approx \Sigma(\mathcal{H})$.

Table 1. Comparison of estimators. The estimates of parameters
$a$, $b$ and $\sigma_\zeta$ are obtained from a sample of $N_1 = 8\,000$
galaxies (which gives values that are free of statistical
fluctuations). The related standard deviations are obtained from
30 trials of such samples. The estimate of $H_0$ corresponds to
the mean value over $N_s = 1\,000$ trials with samples of $N_2 = 100$
objects.

Parameter            ITF* and ITF                              DTF
$a$                  -5.99 ± 0.02                              -5.39 ± 0.02
$b$                  -7.02 ± 0.04                              -8.22 ± 0.05
$\sigma_\zeta$       0.500 ± 0.004                             0.475 ± 0.003
$\hat{H}_0 - H_0$    -0.2 ± 2.4 (ITF*), -0.2 ± 2.2 (ITF)       -0.2 ± 2.2

Fig. 1. Comparison between the ITF, ITF* and the DTF estimates,
without taking into account calibration errors. The x-coordinate of
each symbol corresponds to the ITF estimate. The y-coordinate of the symbols
"+" corresponds to the DTF estimate, while the y-coordinate of the symbols
"O" corresponds to the ITF* estimate.

Figure 1 shows 100 $H_0$-estimates from these different
approaches. The symbols "+" correspond to the DTF
versus the ITF estimate, and the symbols "O" correspond to
the ITF* versus the ITF estimate. The evident distribution
along the diagonal shows that these approaches are indeed
equivalent. Moreover, we can speculate that it is
advantageous to prefer the ITF* approach, which is more
robust, since there is no real gain of accuracy in choosing the
other ones. It is interesting to note that the distribution of
symbols "O" is more scattered about the diagonal than that of the
symbols "+". The differences of accuracy between these
estimates do not suffice to account for such a gap (which can be
estimated to be of the order of 0.01). Therefore, this means that
the ITF*-ITF estimates are less correlated than the DTF-ITF
ones, while one might expect the opposite. Indeed, let us
recall that the ITF model is defined from the ITF* model by
specifying the functions $\phi_m$, $\kappa$ and $f_M$, and thus it is simply
a particular case of the ITF* model, while it is different from
the DTF model. The reason lies in the amount of information
which is used by these estimators. Indeed, the ITF* estimator
uses fewer working hypotheses than the ITF and the DTF ones,
which makes it looser.
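The calibration and the ITF* determination of $H_0$ described above can be reproduced with a short Monte Carlo sketch. It rests on assumptions that should be checked against the main text: $\beta = 3\ln 10/5$ for a uniform spatial distribution, $\mathcal{H} = 5\log_{10} H_0$, $y = a\,p + b - m + \eta$, and closed-form solutions of the maximum likelihood equations of Appendices B.1 and C.1 (slope $\Sigma^2(M)/\mathrm{Cov}(p,M)$ for the ITF model and $\mathrm{Cov}(p,M)/\Sigma^2(p)$ for the DTF model). All function names are hypothetical.

```python
import numpy as np

BETA = 3.0 * np.log(10.0) / 5.0  # assumed: uniform spatial distribution


def simulate(n, rng, h0=100.0):
    """Magnitude-limited sample (m_lim = 12) with the parameters of Eq. (59)-(62)."""
    M = rng.normal(-19.0 - BETA * 1.5**2, 1.5, n)
    mu = 12.0 - M + np.log(1.0 - rng.random(n)) / BETA
    p = (M + 7.0 + rng.normal(0.0, 0.5, n)) / -6.0   # M = a*p + b - zeta
    return M, p, M + mu, mu + 5.0 * np.log10(h0)      # M, p, m, eta


def calibrate_itf(M, p):
    """ITF calibration: inverse-regression slope, zero-mean residuals."""
    c = np.cov(p, M)                 # c[0,0]=Var(p), c[1,1]=Var(M), c[0,1]=Cov
    a = c[1, 1] / c[0, 1]
    b = M.mean() - a * p.mean()
    sigma = np.sqrt(np.mean((a * p + b - M) ** 2))
    return a, b, sigma


def calibrate_dtf(M, p):
    """DTF calibration: direct-regression slope, zero point shifted by BETA*sigma**2."""
    c = np.cov(p, M)
    a = c[0, 1] / c[0, 0]
    sigma = (a * p - M).std()        # scatter of residuals about their mean
    b = M.mean() + BETA * sigma**2 - a * p.mean()
    return a, b, sigma


def h0_itf_star(a, b, m, p, eta):
    """ITF* estimate: average of y = a*p + b - m + eta, converted to H0."""
    return 10.0 ** (np.mean(a * p + b - m + eta) / 5.0)


rng = np.random.default_rng(1)
M1, p1, _, _ = simulate(8000, rng)
itf = calibrate_itf(M1, p1)
dtf = calibrate_dtf(M1, p1)

h0_trials = []
for _ in range(1000):
    _, p2, m2, eta2 = simulate(100, rng)
    h0_trials.append(h0_itf_star(itf[0], itf[1], m2, p2, eta2))
h0_trials = np.array(h0_trials)

print("ITF:", itf)
print("DTF:", dtf)
print("ITF* H0:", h0_trials.mean(), "+/-", h0_trials.std())
```

Under these assumptions the ITF calibration should land near the input values of Eq. (62), while the DTF calibration should come out near the DTF column of Table 1, even though both are fitted to the same data.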



Table 2. Effects due to calibration errors. These estimates
correspond to mean values calculated on $N_s = 1\,000$ trials. The
parameters $a$, $b$ and $\sigma_\zeta$ are obtained from samples of $N_1 = 30$
galaxies, while $H_0$ is determined from samples of $N_2 = 100$
objects.

Parameter            ITF* and ITF                              DTF
$a$                  -6.06 ± 0.40                              -5.40 ± 0.32
$b$                  -6.84 ± 1.02                              -8.21 ± 0.85
$\sigma_\zeta$       0.498 ± 0.075                             0.469 ± 0.064
$\hat{H}_0 - H_0$    -0.2 ± 4.4 (ITF*), -0.2 ± 4.0 (ITF)       -0.1 ± 4.0

In general, the calibration of the TF relation is performed
on only a few tens of galaxies, which makes the estimates of
model parameters $a$, $b$ and $\sigma_\zeta$ much less precise. Hence, the
determination of $H_0$ undergoes the related statistical
fluctuations, herein called calibration errors. Such effects are
investigated by using simulated calibration samples, with a more
realistic sample size of $N_1 = 30$ galaxies, and by determining
$H_0$ on samples of $N_2 = 100$ objects. The statistical analysis
is performed on $N_s = 1\,000$ trials. The results are shown in
Table 2, which gives the averages of parameter estimates and
their related accuracy (1σ). The comparison with Table 1 shows
that the estimates of model parameters $a$, $b$ and $\sigma_\zeta$ are
similar, and that the estimation of $H_0$ is not biased by calibration
errors, while it is obviously less accurate. Figure 2 shows more
elongated distributions than those in Fig. 1, but still about
the diagonal. At first glance, the main result is that the above
conclusions are still valid when calibration errors are taken into
account (actually, these approaches appear
even more equivalent, since the scatter of symbols "+"
about the diagonal is now comparable to that of symbols
"O").

Fig. 2. Comparison between the ITF, ITF* and the DTF estimates,
taking into account calibration errors. The same caption as in
Fig. 1.

4. About measurement errors 

It is clear that a Malmquist bias is present only when a part of
the (luminosity) pdf is not observed. Since this is independent
of the error distribution, we understand that measurement errors
do not produce such a bias, see (Def.6). Nevertheless, in
order to answer the question of whether these effects
introduce another type of bias, we must use a precise mathematical
framework to avoid misunderstandings.

4.1. The statistical model

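Let $\{\epsilon_m, \epsilon_\mu, \epsilon_p, \epsilon_\eta\}$ denote the measurement errors. We use the
hat symbol "^" to distinguish the (measurable) variables (i.e.,
the ones which are affected by measurement errors) from the
formal ones, which are given by

$$m = \hat{m} - \epsilon_m, \qquad (63)$$
$$\mu = \hat{\mu} - \epsilon_\mu, \qquad (64)$$
$$p = \hat{p} - \epsilon_p, \qquad (65)$$
$$\eta = \hat{\eta} - \epsilon_\eta. \qquad (66)$$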



If these errors are independent Normal random variables, they
are distributed according to the following pd

$$dP_\epsilon^{(s)} = \prod_{\lambda \in \Lambda^{(s)}} g_G(\epsilon_\lambda; 0, \sigma_{\epsilon_\lambda}^{(s)})\,d\epsilon_\lambda, \qquad (67)$$

where the index $\lambda$ takes character values among the set
$\Lambda^{(1)} = \{m, \mu, p\}$ or $\Lambda^{(2)} = \{m, \eta, p\}$, depending on Step 1 or
Step 2. Since a magnitude limited sample can be selected from
a catalog, where the data are already affected by measurement
errors, we easily understand that the selection function must
be written in terms of measurable observables. Therefore,
according to Eq (fel), the pd which describes both the observables
and the random errors is given by



dPl 



(s) 



°th (Pi^' 



dP,^xdPi'\ (s = l,2), 



where 



4>m{m) = (t)m{m) 



(68) 



(69) 



It is clear that since the errors $\epsilon_\lambda$, see Eq. (p3l-p3), cannot
be disentangled from intrinsic scatter, the ML technique is not
feasible for obtaining genuine statistics. However, we can
overcome this obstacle by substituting the suitable corrections by
their expected values, which can be calculated according to
the pd given in Eq. (B8). For convenience in writing, we use the
following dimensionless quantities



Sr, — 



5m 



(Si(p))2' 

(Ei(Af))2 



(70) 
(71) 



see Eq. am, and Cp, see Eq. (p2). It turns out that the spatial
distribution of sources, i.e., $\kappa(\mu)$, must be specified a priori
in order to perform such calculations, see AppendixM. If we
assume that it is uniform ($h_2$), see Eq. (11), then the
normalization term is given by



3(s) 



0)=p« 



) exp 



ili^'syy 



(72) 



where $P_{th}(\phi_m)$ is the normalization term when measurement
errors are not taken into account. The effects due to
measurement errors lie only in the extra term, which turns
out to be independent of the model parameters, and which thus
ensures the absence of Malmquist bias in estimating these
parameters. Nevertheless, there are biases of a different nature in
the $H_0$ estimates, which are given in Eq. (bq,p3,k2|) , since the
calculation provides us with



Ti = H +CpdMa Si(p) 



+ 



V> 






H" 



n 



DTP I fr "p iDTF 



a""Si(p) 



/3(K„' 



Up 



(73) 
(74) 

(75) 



The bias free ITF statistics is obtained by substituting in 



(67) Eq.(Q the terms VF"^* and IHp , as given by Eg 

Therefore, we see that the statistics given in Eq. (G6ll3iB4E2| 



can be restored as long as the standard deviations of errors are
known, i.e., a^^ and a^^ in the case of the ITF model, and
$\sigma_{\epsilon_p}$ in the case of the DTF model. However, it is clear that
these corrections should be tiny quantities, since $\delta_p \ll 1$ and
$\delta_M \ll 1$, unless the information is buried in noise. Moreover,
it is interesting to note that they are of a different nature: the
first one depends on TF characteristics and can be removed
by using the Cp-criteria, see Eq. (57), while the other one does
not.

4.2. Applications



Similarly to Section 3.5, we perform simulations in order to
illustrate the above results and to investigate the effects of
measurement errors on the accuracy of estimates. According to the
working hypotheses, we use simulated samples with characteristics
given by Eq. (58)-(62), where the observables are perturbed
by normal random errors. In practice, according to
Gouguenheim (1993), the observables are measured within the following
accuracies:

- the line width (which gives $p$) is measured within 20 km/s;

- the recession velocity of galaxies is given within 15 km/s;

- for calibrators (Step 1), the apparent magnitudes are measured
  within an accuracy of 0.05 mag, while for Step 2 the
  accuracy depends on magnitude: it is of the order of 0.1 mag
  for m < 13, of 0.15 mag for 13 < m < 14, and of 0.2 mag
  for m > 14.

These uncertainties can be interpreted as 2-3 standard
deviations of the errors. We can use a good compromise on
the magnitude of errors, avoiding their dependence on the
magnitude of the related variable, by assuming the following
characteristics:



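$$\sigma_{\epsilon_m}^{(1)} = 0.05, \quad \sigma_{\epsilon_p}^{(1)} = 0.025, \quad \sigma_{\epsilon_\mu}^{(1)} = 0.15, \qquad (76)$$

and for samples used to determine $H_0$, according to
Fouque (1995), we choose

$$\sigma_{\epsilon_m}^{(2)} = 0.15, \quad \sigma_{\epsilon_p}^{(2)} = 0.025, \quad \sigma_{\epsilon_\eta}^{(2)} = 0.0. \qquad (77)$$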



The effect on the $H_0$ statistics due to measurement errors,
but regardless of calibration errors, is investigated by using
a unique calibration sample with $N_1 = 8\,000$ galaxies, and a
sample of $N_2 = 100$ objects for the $H_0$ determination. These
samples are generated according to Eq. (58)-(62), and both are
perturbed by normal random errors with characteristics
defined by Eq. (76, 77). The statistical analysis is performed on
$N_s = 1\,000$ trials. The statistics of model parameters $a$, $b$ and
$\sigma_\zeta$, corrected for the bias due to measurement errors, are
defined in Eq. ( |E14^17| ). The related results are given in Table 3,
which shows the averages of model parameter estimates and
their related accuracies (1σ), and the magnitude of the
correction terms which are present in the $H_0$ statistics, given in
Eq. (73)-(75). We can note that these corrections are effective
since the mean value of the $H_0$ estimates gives back the value $H_0$.
However, it is clear that this is a minor quantity compared to
the $H_0$ standard deviation. Figure 3 shows the same diagram as
in Fig. 1. The comparison between these figures indicates that
the measurement errors do not perturb the correlation between
the ITF*, the ITF and the DTF methods. A similar analysis
is performed by taking into account calibration errors
simultaneously. The method of proceeding is identical to Section 3.5
above. The results are given in Table 4 and in Fig. 4. The main
effect of calibration errors is to increase the standard deviation
of both the correction term and $H_0$, which does not change our
previous conclusions, while the ITF estimate seems to be 10
percent more accurate.

Note that these criteria can motivate the choice of the model
to be used.

Table 3. Measurement errors, without calibration errors. The
results are based on $N_s = 1\,000$ trials. The parameters $a$, $b$ and
$\sigma_\zeta$ are obtained from samples of $N_1 = 8\,000$ galaxies, while $H_0$
is measured from samples of $N_2 = 100$ objects. The correction
term $\Delta H_0$ is written in $H_0$ units.

Parameter            ITF* and ITF                              DTF
$a$                  -6.00 ± 0.03                              -5.40 ± 0.02
$b$                  -6.99 ± 0.07                              -8.20 ± 0.06
$\sigma_\zeta$       0.501 ± 0.005                             0.475 ± 0.004
$\hat{H}_0 - H_0$    -1.2 ± 2.5 (ITF*), -1.2 ± 2.4 (ITF)       -1.2 ± 2.4
$\Delta H_0$         -1.27 ± 0.08 (ITF*), -1.27 ± 0.07 (ITF)   -1.27 ± 1.79

Fig. 3. Comparison between the ITF, ITF* and the DTF estimates.
The calibration errors are taken into account. The same caption as
in Fig. 1.



Table 4. Measurement errors, with calibration errors. The
same caption as in Table 3.

Parameter            ITF* and ITF                              DTF
$a$                  -6.03 ± 0.42                              -5.43 ± 0.35
$b$                  -6.90 ± 1.06                              -8.12 ± 0.90
$\sigma_\zeta$       0.482 ± 0.084                             0.457 ± 0.073
$\hat{H}_0 - H_0$    -0.6 ± 5.2 (ITF*), -0.6 ± 4.8 (ITF)       -0.8 ± 5.0
$\Delta H_0$         -1.29 ± 0.13 (ITF*), -1.29 ± 0.11 (ITF)   1.25 ± 1.77



5. The distances of galaxies and $H_0$

Fig. 4. Comparison between the ITF, ITF* and the DTF estimates.
The calibration errors are taken into account. The same caption as
in Fig. 1.

Within the same framework, let us investigate the problem of
finding a reliable determination of distances of galaxies, by
using the TF relation. This still involves a first step of calibration
of the TF relation, and thus we refer to the above results. On the
other hand, the next step does not involve a second sample,
but only a single galaxy, whose related data are denoted
by $m_k$ and $p_k$. At first glance, according to Eq. ([zllia), the
distance modulus of a galaxy of known apparent magnitude $m_k$
and line width distance estimator $p_k$ can be estimated by the
following statistic

$$\tilde{\mu}_k = m_k - (a\,p_k + b). \qquad (78)$$



We easily understand that, because $a$ and $b$ are (as a matter
of fact) model dependent parameters, similarly to the above
approach, the question of whether this (ad hoc) estimate is
biased can be answered only within a given model. Thus we
have to define a pd which describes the distribution of the distance
modulus $\mu$ given $m = m_k$ and $p = p_k$, that we denote
$dP_\mu^{(k)} = f_\mu(\mu; \mu_0^{(k)}, \sigma_\mu^{(k)})\,d\mu$. According to Eq. (8), by using
conditional probabilities, we have



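$$dP_\mu^{(k)} = \frac{\delta(m - m_k)\,\delta(p - p_k)}{P_{obs}\left(\delta(m - m_k)\,\delta(p - p_k)\right)}\,dP_{obs}, \qquad (79)$$

and thus

$$f_\mu(\mu; \mu_0^{(k)}, \sigma_\mu^{(k)}) = \frac{F(m_k - \mu, p_k)\,\kappa(\mu)}{\int F(m_k - \mu, p_k)\,\kappa(\mu)\,d\mu}. \qquad (80)$$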



Since no selection function intervenes in this equation, we
deduce that no Malmquist bias is present. Moreover, the
expected value of the distance modulus, which is given by

$$\mu_0^{(k)} = \int \mu\,dP_\mu^{(k)}, \qquad (81)$$

see Eq. (79), provides us with the maximum likelihood statistic.
It is clear that such an estimate depends on the form of the
TF pdf, which reads

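$$F(m_k - \mu, p_k) = g_G(\mu; \tilde{\mu}_k, \sigma_\zeta) \times \begin{cases} a\,f_M(m_k - \mu; M_0, \sigma_M) & \text{(ITF)} \\ f_p(p_k; p_0, \sigma_p) & \text{(DTF)} \end{cases} \qquad (82)$$

see Eq. (Pq,p7^, |8l|). Hence, the pd given in Eq. (79) transforms into

$$dP_\mu^{(k)} \propto g_G(\mu; \tilde{\mu}_k, \sigma_\zeta)\,\kappa(\mu)\,d\mu \times \begin{cases} f_M(m_k - \mu; M_0, \sigma_M) & \text{(ITF)} \\ 1 & \text{(DTF)} \end{cases} \qquad (83)$$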



The very interesting feature, which is shown by Eq. (83), is
that the distance modulus estimate given by the DTF model,
see Eq. (81), does not depend on the luminosity distribution
function of sources. Secondly, (in all cases) the distance
modulus estimate given in Eq. (78) must be corrected for a bias.
In order to calculate the correction term, we have to specify
the luminosity distribution function $f_M(M)$ in the case of the
ITF model, and the function $\kappa(\mu)$ in both cases. So let us
assume that the sources are uniformly distributed in space ($h_2$),
and, where needed, that they show a Gaussian luminosity
distribution function ($h_4$). Hence, by using (Def.2.c,d), we obtain
straightforwardly the following distance modulus estimate



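$$\mu_0^{(k)} = \begin{cases} \dfrac{1}{1+\gamma^2}\left((\tilde{\mu}_k + \beta\sigma_\zeta^2) + \gamma^2 (m_k - M_0)\right) & \text{(ITF)} \\ \tilde{\mu}_k + \beta\sigma_\zeta^2 & \text{(DTF)} \end{cases} \qquad (84)$$

where $\gamma = \gamma^{\mathrm{ITF}}$, see Eq. (113), with an (accuracy) standard
deviation

$$\sigma_\mu^{(k)} = \sigma_\zeta \times \begin{cases} 1/\sqrt{1+\gamma^2} & \text{(ITF)} \\ 1 & \text{(DTF)} \end{cases} \qquad (85)$$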
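As a numerical illustration of Eq. (84), the sketch below evaluates the ITF and DTF distance moduli of a single galaxy using the calibration values of Table 1. It assumes $\beta = 3\ln 10/5$ (uniform spatial distribution) and $\gamma = \sigma_\zeta/\sigma_M$; the galaxy data ($m_k = 12$, $p_k = 2.5$) and the function names are hypothetical.

```python
import math

BETA = 3.0 * math.log(10.0) / 5.0  # assumed: uniform spatial distribution


def mu_dtf(m, p, a, b, sigma_zeta):
    """DTF distance modulus: ad hoc estimate of Eq. (78) plus the volume correction."""
    return m - (a * p + b) + BETA * sigma_zeta**2


def mu_itf(m, p, a, b, sigma_zeta, m0, sigma_m):
    """ITF distance modulus: weighted mean of the corrected TF estimate and m - M0."""
    gamma2 = (sigma_zeta / sigma_m) ** 2
    corrected = m - (a * p + b) + BETA * sigma_zeta**2
    return (corrected + gamma2 * (m - m0)) / (1.0 + gamma2)


# Hypothetical galaxy, calibrated with the Table 1 values of each model.
m_k, p_k = 12.0, 2.5
itf = mu_itf(m_k, p_k, a=-5.99, b=-7.02, sigma_zeta=0.500, m0=-19.0, sigma_m=1.5)
dtf = mu_dtf(m_k, p_k, a=-5.39, b=-8.22, sigma_zeta=0.475)
print(itf, dtf, itf - dtf)
```

With these inputs the two models differ by far less than the intrinsic scatter $\sigma_\zeta$, which is the behaviour the text attributes to statistical fluctuations of the calibration alone.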



According to Eq. ( [L5| , p9|j4^ ) , it turns out that these
distance modulus estimates have similar accuracy, while more
information is used for calculating the ITF distance modulus
estimate. By using Eq. ( [l8[]4q ), we easily calculate the difference
$\Delta\mu = \mu^{\mathrm{ITF}} - \mu^{\mathrm{DTF}}$ between the distance modulus estimates.
We obtain

$$\Delta\mu = \frac{\gamma^2}{1+\gamma^2}\left(a^{\mathrm{ITF}}\langle p\rangle_1 + b^{\mathrm{ITF}} + \beta\,(\Sigma_1(M))^2 - M_0\right). \qquad (86)$$

Let us emphasize that $a^{\mathrm{ITF}}\langle p\rangle_1 + b^{\mathrm{ITF}} + \beta\,(\Sigma_1(M))^2$ is an
unbiased statistic which gives $M_0$ within $\sigma_M/\sqrt{N_1}$.
Therefore, Eq. (86) shows that the difference turns out to be a tiny
quantity. Namely, $\Delta\mu$ has a vanishing expected value, with a
standard deviation given by



0"An 



^^VT+T^ 



(87) 



where $\gamma = \gamma^{\mathrm{ITF}}$, see Eq. (Ila). Therefore, this shows that we
obtain the same distance modulus estimate by using different
models.

Finally, we come to the conclusion that the choice of the
model should be based on the reliability of hypotheses used
about the selection effects, and it is interesting to note that
the DTF approach is more robust than the ITF one. The
correction terms in Eq. (84) are not related to biases of Malmquist
type but are identified with volume corrections, herein calculated for
homogeneous spatial distributions or of power law type (i.e.,
$\beta \neq \frac{3\ln 10}{5}$); the inhomogeneous case is straightforward.

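Obvious calculations show that the effects of calibration
errors on distance modulus estimates make them less accurate by
introducing a white noise. Since the calibration errors $\delta a$ and
$\delta b$ shift the estimate of Eq. (78) by $-(p\,\delta a + \delta b)$, this noise has
a p-dependent standard deviation given by

$$\sqrt{\sigma_{\delta a}^2\,p^2 + 2\,\mathrm{Cov}(\delta a, \delta b)\,p + \sigma_{\delta b}^2}, \qquad (88)$$

where $\delta a$ and $\delta b$ denote the calibration errors. The
simulations show that

$$\sigma_{\delta a} = \begin{cases} 0.21 & \text{(ITF)} \\ 0.22 & \text{(DTF)} \end{cases} \qquad (89)$$

$$\sigma_{\delta b} = \begin{cases} 0.53 & \text{(ITF)} \\ 0.56 & \text{(DTF)} \end{cases} \qquad (90)$$

$$\mathrm{Cov}(\delta a, \delta b) = \begin{cases} -0.11 & \text{(ITF)} \\ -0.12 & \text{(DTF)} \end{cases} \qquad (91)$$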

Now it is natural to make the link between the distance
modulus and the $H_0$ estimates. A simple formal comparison
between Eq. ([34ll43,R4J) provides us with



1 



N2 



N2 ^^ 
fe=l 



iVk 



mD 



(ITF) 
(DTF) 



(92) 



It is obvious that such an equality is valid within the
common set of hypotheses, which is confined to those specified for
Eq. (y,^2|j8J). Now, we can understand that the $H_0$ statistics
given in Eq. (Kq) has its foundation in a context of distance
estimates.

6. Conclusion 

We present a general framework to estimate the Hubble
constant, as well as the distances of galaxies, when their peculiar
velocities are neglected, by using distance estimators given by
the Tully-Fisher or the Faber-Jackson relations. Such relations
can be regarded as a single law describing the observed linear
correlation between the absolute magnitude $M$ of galaxies and
their line width distance indicator $p$. This well known problem
has been clarified by taking into account a random
variable $\zeta$ of zero mean which accounts for an intrinsic scatter of
the TF relation ($M = a\,p + b - \zeta$). The method consists of
two steps: the a priori choice of a statistical model, which
is defined essentially by working hypotheses about the data
distributions; and the derivation of parameter statistics by
means of the maximum likelihood technique. This method has
the advantage of providing unbiased estimates of model
parameters, as long as the selection effects are taken into account
by the statistical model. As standard, we assume a magnitude
limited (complete) sample of sources uniformly distributed in
space which shows a Gaussian luminosity distribution function,
although this method can easily be extended to more realistic
situations. It turns out that the presence of p-selection effects
(which is not investigated here) makes this problem much more
difficult, although some results require even weaker
hypotheses.

We show that the "Direct Tully-Fisher" and the "Inverse
Tully-Fisher" methods can be identified as maximum likelihood
statistics related to particular models (herein denoted DTF and
ITF), whose difference lies in describing the TF diagram
in a different way. At first glance, one might wonder whether
such an a priori choice is justified, since these models replace
the one which should be prescribed by the physics of galaxies
(responsible for the M-p correlation), and which is not yet
known. Fortunately, it is reassuring to point out that the
estimates of galaxy distances and $H_0$ are not model dependent,
contrary to the calibration parameters $a$ and $b$. Actually, these
models belong to a wide class of models, and both of them can
be interpreted as a choice of a particular "orientation" for
fitting the TF relation (according to usual definitions). However,
the advantage of using models instead of fitting approaches is
that one avoids subjective interpretations and obtains clear-cut
and unambiguous results. For example, we easily understand
that, in order to obtain meaningful estimates, the calibration
of the TF relation and the determination of $H_0$, or of the
distances of galaxies, have to be performed within the same model,
regardless of selection effects. Moreover, it turns out that the
$H_0$ statistics are still valid when additional selection effects (or
sampling rules) are present, which informs us about the robustness
of these statistics. For example, in the case of the ITF model,
selection effects with respect to distance modulus, or redshift,
and/or $M$, do not perturb the estimate. On the other hand,
in the case of the DTF model only additional selection effects
with respect to the redshift are allowed.

The main result, which ends the well known debate, is
that the ITF and DTF estimates show identical expectations.
Namely, the difference between the ITF and DTF statistics in
estimating $H_0$, resp. a distance modulus $\mu$ (considered as
random variables), has a vanishing mean value and a standard
deviation of the order of $\sigma_\zeta\gamma\sqrt{1/N_{cal} + 1/N}$, resp. $\sigma_\zeta\gamma\sqrt{1/N_{cal}}$,
where $N_{cal}$ is the size of the calibration sample, $N$ is the
size of the sample used to determine $H_0$, and where the
ratio $\gamma = \sigma_\zeta/\sigma_M$ informs us about the gain of accuracy when using
the TF diagram. In practice, they are different only because of
statistical fluctuations. It is interesting to point out that these
approaches provide us with the same $H_0$ estimates when the
calibration sample and the sample used to estimate $H_0$ show
identical p-averages (herein called "Cp-criteria").

Therefore, the choice between the ITF and the DTF
approaches should be motivated by arguments about selection
effects, accuracy and robustness of estimates. Actually, in the
case of $H_0$ estimates, only the first criterion intervenes since
the ITF and the DTF approaches show identical accuracy and
robustness. With this in mind, we introduce a newly defined
$H_0$ statistic, whose related model (herein denoted by ITF*)
includes the ITF model; no hypothesis is required on the
luminosity distribution function of sources or on their spatial
distribution, and it is still valid when the sample is not complete.
While it is a little less accurate (by a factor of $\sqrt{1+\gamma^2}$), its
advantage is to be much more robust than the ones related to
the ITF and the DTF models. Finally, simulations show that
$H_0$ can be estimated with an accuracy of the order of 5% (1σ),
which takes into account calibration and measurement errors
(actually the former prevail over the latter). In the case
of distances, it turns out that the DTF estimate is more
robust than the ITF estimate, because it does not depend on
the luminosity distribution of sources. Both estimates include a
correction for a bias, inadequately believed to be of Malmquist
type.

A. Notations and useful formulas 

The mathematical formalism is similar to the one used in Bigot
& Triay (1990a). The following features are addressed
throughout the text by using the symbol "Def.".

Def.1 The probability density (pd) of a random variable $x$ reads
$dP(x) = f(x)\,dx$, where $f(x)$ represents the pd function
(pdf); we have $\int dP(x) = 1$. Sometimes, it is useful to
exhibit the model parameters involved in the statistical
model, as the mean $x_0$ and the standard deviation $\sigma$, by
writing $f(x; x_0, \sigma)$.

(a) $g_G(x; x_0, \sigma) = (\sigma\sqrt{2\pi})^{-1} \exp\left(-(x - x_0)^2/(2\sigma^2)\right)$ is a
Gaussian pdf.

(b) A normal pdf can be written $g_N(x) = g_G(x; 0, 1)$.

(c) The cumulative Normal pdf reads $\mathcal{N}(x) = \int_{-\infty}^{x} g_N(t)\,dt$.

Def.2 Let $f$ be a pdf and $\lambda$ be a scalar value; in most
calculations, we use the following properties:

(a) $f(x + \lambda; x_0, \sigma) = f(x; x_0 - \lambda, \sigma)$;

(b) $f(\lambda x; x_0, \sigma) = \lambda^{-1} f(x; x_0/\lambda, \sigma/\lambda)$;

(c) $\exp(\lambda x)\,g_G(x; x_0, \sigma) = \exp\left(\lambda(x_0 + \frac{\lambda}{2}\sigma^2)\right) g_G(x; x_0 + \lambda\sigma^2, \sigma)$;

(d) $g_G(x; x_1, \sigma_1)\,g_G(x; x_2, \sigma_2) = g_G(x; x_0, \sigma_0)\,g_G(x_1; x_2, \sigma)$,
where $\sigma = \sqrt{\sigma_1^2 + \sigma_2^2}$, and $x_0$ and $\sigma_0$ are defined as follows:
$\sigma_0^{-2} = \sigma_1^{-2} + \sigma_2^{-2}$ and $x_0\sigma_0^{-2} = x_1\sigma_1^{-2} + x_2\sigma_2^{-2}$.

Def.3 $P(h) = \int h(x)\,dP(x)$ denotes the expected value of the
function $h(x)$.

Def.4 The pd of a sample data $\{G_k\}_{k=1,N}$, which consists of $N$
independently selected objects $G_k$, is given by $\prod_{k=1}^{N} dP(G_k)$.

(a) Its pdf, written in terms of observables (the measurable
random variables), but regarded as a function of model
parameters, provides us with the likelihood function.

(b) (The ML method.) The model parameters statistics are
obtained by maximizing the likelihood function, or
(equivalently) the natural logarithm of the efficient part
of it, in which the terms which do not contribute to the
determination of parameters are removed, herein briefly
denoted by lf.

Def.5 We use the following usual definitions:

(a) $\langle x\rangle = \sum_{k=1}^{N} x_k/N$ is the average,

(b) $\mathrm{Cov}(x, y) = \sum_{k=1}^{N} (x_k - \langle x\rangle)(y_k - \langle y\rangle)/(N - 1)$ is the
covariance,

(c) $\Sigma(x) = \sqrt{\mathrm{Cov}(x, x)}$ is the standard deviation,

(d) $\rho(x, y) = \mathrm{Cov}(x, y)/(\Sigma(x)\Sigma(y))$ is the correlation
coefficient.

Def.6 The problem of biases in Statistics Theory is well
established: an estimator (or statistic) is biased when its
expected value does not correspond to the model parameter for
which it has been made up. In practice, a bias is expected
when the normalization factor depends on the model
parameter, see Bigot & Triay (1990a, 1990b). For instance, the
average of absolute magnitudes, as provided by a sample
of objects brighter than a given limiting apparent magnitude,
is a biased estimator of the mean intrinsic magnitude
(that characterizes the population of sources). Herein, such
a bias is designated as bias of Malmquist type, a definition
which can be extended to any bias due to selection effects.

Def.7 The accuracy of an estimator is formally defined as the
reciprocal of its variance (the smaller the dispersion, the
greater the precision).

B. Calculations involved in the ITF model

According to Eq. (WplGq) , the normalization factor reads

$$P_{th}(\phi_m) = \int \phi_m(M + \mu)\,f_M(M; M_0, \sigma_M)\,dM\,\kappa(\mu)\,d\mu, \qquad (B1)$$

which shows that it does not depend on parameters $a$, $b$ and
$H_0$. Let us note that if we specify the functions $\phi_m$, $\kappa$ and $f_M$,
according to hypotheses ($h_1$, $h_2$, $h_3$, $h_4$), see Eq. (y,hjlllfl,p4),
then the normalization factor is given by

$$P_{th}(\phi_m) \propto \exp\beta\left(m_{\lim} - M_0 + \frac{\beta}{2}\sigma_M^2\right). \qquad (B2)$$

B.1. Calibration statistics

According to Eq. (|,0jl2[ |l|,^, the pd given in Eq. (|) reads
in terms of observables as

$$dP_{obs} = \frac{\phi_m(M + \mu)}{P_{th}(\phi_m)}\,f_M(M; M_0, \sigma_M)\,dM\,\kappa(\mu)\,d\mu \times g_G(a\,p + b; M, \sigma_\zeta)\,a\,dp, \qquad (B3)$$

where the first right hand term is independent of $a$ and $b$,
see Eq. (B1). Therefore, the lf $\mathcal{L}^{\mathrm{ITF}}(a, b, \sigma_\zeta)$ can be written as
follows

$$\mathcal{L}^{\mathrm{ITF}} = \ln\frac{a}{\sigma_\zeta} - \frac{1}{2N_1}\sum_{k=1}^{N_1}\left[\frac{a\,p_k + b - M_k}{\sigma_\zeta}\right]^2. \qquad (B4)$$

Hence, the ML equations (obtained by equating the partial
derivatives of $\mathcal{L}^{\mathrm{ITF}}$ with respect to $a$, $b$, and $\sigma_\zeta$ to zero) read

$$a\,\langle p\,(a\,p + b - M)\rangle_1 = \sigma_\zeta^2, \qquad (B5)$$
$$a\,\langle p\rangle_1 + b = \langle M\rangle_1, \qquad (B6)$$
$$\langle (a\,p + b - M)^2\rangle_1 = \sigma_\zeta^2. \qquad (B7)$$

By expanding Eq. (B7) as $a\langle p(a\,p + b - M)\rangle + b\langle a\,p + b - M\rangle - \langle M(a\,p + b - M)\rangle = \sigma_\zeta^2$,
it follows that:

According to Eq. (B5), the first left hand term is equal to $\sigma_\zeta^2$.

According to Eq. (B6), the second left hand term is zero.

Thus we have $a\langle pM\rangle + b\langle M\rangle = \langle M^2\rangle$.

Hence, by subtracting $\langle M\rangle \times$ (B6), one obtains Eq. (29).

Equation (29|) follows immediately from Eq. (B6). By subtracting
$a\langle p\rangle \times$ (B6) from Eq. (B5), we obtain $a^2(\langle p^2\rangle - \langle p\rangle^2) - a(\langle pM\rangle - \langle p\rangle\langle M\rangle) = \sigma_\zeta^2$,
which gives Eq. (|2|).

B.2. Determination of $H_0$

According to Eq. (§,0jl2[ |l|,§|), the pd given in Eq. (|) reads
in terms of observables $x$, $y$ and $\eta$, see Eq. (HiMGd), as follows

$$dP_{obs} = \frac{\phi_m(x + \eta)}{P_{th}(\phi_m)}\,f_M(x + \mathcal{H}; M_0, \sigma_M)\,\kappa(\eta - \mathcal{H})\,dx\,d\eta \times g_G(y; \mathcal{H}, \sigma_\zeta)\,dy. \qquad (B8)$$



It is important to note that this pd reads as a product of two
independent pds, and thus that the distribution of the random
variable $y$ does not depend on $x$ and $\eta$, whatever the form of
the functions $f_M$, $\kappa$ and $\phi_m$.

The integration over $x$ and $\eta$ yields

$$dP_{obs}^{\mathrm{ITF*}} = g_G(y; \mathcal{H}, \sigma_\zeta)\,dy, \qquad (B9)$$

which shows that the expected value of $y$ provides us with
$P_{obs}(y) = \mathcal{H}$. Therefore, the Hubble constant can be estimated
by means of the statistic given in Eq. (B3). Moreover, Eq. (B9)
shows that the standard deviation of the $y$-distribution is equal
to $\sigma_\zeta$. Thus, the standard deviation of the statistic providing
$H_0$ is given by Eq. (H).
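As a worked check of this accuracy, the sketch below converts the scatter of the average of $y$ over $N_2$ objects into an uncertainty on $H_0$. It assumes $\mathcal{H} = 5\log_{10} H_0$, so that a small error $\sigma_{\mathcal{H}}$ maps to $H_0\,(\ln 10/5)\,\sigma_{\mathcal{H}}$ at first order; the function name is hypothetical.

```python
import math


def h0_sigma_itf_star(h0, sigma_zeta, n):
    """First-order 1-sigma uncertainty on H0 from the mean of n values of y."""
    sigma_h = sigma_zeta / math.sqrt(n)          # std of the average of y
    return h0 * math.log(10.0) / 5.0 * sigma_h   # assumes H = 5*log10(H0)


# Simulation settings of Section 3.5: H0 = 100, sigma_zeta = 0.5, N2 = 100.
print(h0_sigma_itf_star(100.0, 0.5, 100))
```

For the settings of Section 3.5 this gives about 2.3 in units of km/s/Mpc, to be compared with the ITF* scatter reported in Table 1.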

If we specify the functions $\kappa$ and $f_M$, then we can perform
the ML technique. By assuming hypotheses ($h_2$, $h_3$, $h_4$),
see Eq. (0|l6[||), the lf is given by



iV2 



•Cdct (-'^o) 



{Vk^-Hf 



2ol 



--y 

*,— 1 

iV, 



m 



^^li 



(BIO) 



since the normalization factor does not depend on $H_0$. Hence,
the likelihood equation ($\partial\mathcal{L}_{det}^{\mathrm{ITF}}/\partial H_0 = 0$) provides us with the
statistic given in Eq. (B4|) .

In order to estimate the accuracy of the statistic (p4), we
have to calculate the pdf of the following random variable



ry + a^ {{Mo - (3alj) - x) 



(Bll) 



C.l. Calibration statistics 



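According to Eq. (B2, C1, C2), which shows that $P_{th}(\phi_m)$
depends indeed on parameters $a$, $b$, $p_0$, $\sigma_p$ and $\sigma_\zeta$, and to
Eq.(g||ji|, ^), the lf $\mathcal{L}^{\mathrm{DTF}}(a, b, \sigma_\zeta, p_0, \sigma_p)$ can be written
as follows

$$\mathcal{L}^{\mathrm{DTF}} = -\ln P_{th}(\phi_m) - \ln\sigma_p - \ln\sigma_\zeta - \frac{1}{2N_1}\sum_{k=1}^{N_1}\left[\frac{a\,p_k + b - M_k}{\sigma_\zeta}\right]^2 - \frac{1}{2N_1}\sum_{k=1}^{N_1}\left[\frac{p_k - p_0}{\sigma_p}\right]^2. \qquad (C3)$$

According to Eq. (C3), the ML equations (obtained by equating
the partial derivatives of $\mathcal{L}^{\mathrm{DTF}}$ with respect to $a$, $b$, $p_0$, $\sigma_p$ and
$\sigma_\zeta$ to zero) can be written

$$\langle p\,(a\,p + b - M)\rangle_1 = \beta\sigma_\zeta^2\left(p_0 - \beta\sigma_p^2\,a\right), \qquad (C4)$$
$$a\,\langle p\rangle_1 + b = \langle M\rangle_1 + \beta\sigma_\zeta^2, \qquad (C5)$$
$$p_0 = \langle p\rangle_1 + \beta\sigma_p^2\,a, \qquad (C6)$$
$$\langle (p - p_0)^2\rangle_1 = \sigma_p^2\left(1 + \beta^2\sigma_p^2\,a^2\right), \qquad (C7)$$
$$\langle (a\,p + b - M)^2\rangle_1 = \sigma_\zeta^2\left(1 + \beta^2\sigma_\zeta^2\right). \qquad (C8)$$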



Obvious calculationscJ give 



a 



^p^ITF ^ g(.[z;n, 



c^C 



ViT' 



--)dz, 



(B12) 



Equations (C6, C7) show that $\sigma_p = \Sigma(p)$, while the Malmquist
bias intervenes in the statistic Eq. (C6). According to Eq. (C6),
Eq. (C4) $- \langle p\rangle \times$ Eq. (C5) yields Eq. (^). By expanding
$\langle (a\,p + b - M)^2\rangle$, according to Eq. (C4, C5), we obtain Eq. (^).



see Eq. (|l|). According to Eq. (p9|,p| jB12[ )^d since 
Ei(M), we obtain the accuracy given in Eq. (p5|). 



0"M 



C. Calculations involved in the DTF model 

Now, according to Eq. (|[0|[ |l|,^^), it turns out-that the 
random variables M, p and /i are correlated togethertJ. Hence, 
the normalization factor Pth(<^m) becomes dependent on model 
parameters a and 6. Thus, for proceeding with the ml tech- 
nique, we have to calculate explicitly Pth{4'm) and its deriva- 
tives with respect to model parameters, which forces us to pre- 
sume a priori the form of functions (jtmir n) , fr, (p; po, ay) and 
fv(/i). We asstwne (/ii,/i2,/i3,/i4), see Eq. (p3|, [Ll|j3^ ) . Hence, after 
little al gebr aic, it turns out that the normalization is still given 
by Eq.(B|), with 



Mo = apo + b, 



CM = \ a^a-p + ai 



(CI) 
(C2) 



^^- Eq. ( |Bq ) is written in terms of variables x, rj and z, accord- 
ingly to Eq. (y,hiy24), - one integrates over r), and hence over 
X, we use (Z)e/.2.c,d). 

^^- M and p, because of the TF diagram, - M and p,, because 
the selection function 4>m (M + n) does not split into a prod- 
uct of two functions, - n and p, as a consequence of above 
correlations. 

^^The calculation is straightforward by means of by part inte- 
grations, successively over /i, M, and finally p, where {Def. 2.c) 
is used twice. 
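The ML equations (C4)-(C8) of the DTF calibration admit a closed-form solution, which the following sketch computes and then checks by substitution. It assumes $\beta = 3\ln 10/5$ (uniform spatial distribution) and takes the averages $\langle\cdot\rangle_1$ as plain sample means over the calibration sample; the function names are hypothetical.

```python
import numpy as np

BETA = 3.0 * np.log(10.0) / 5.0  # assumption: uniform spatial distribution


def dtf_calibration(p, M, beta=BETA):
    """Closed-form solution (a, b, p0, sigma_p, sigma_zeta) of Eqs. (C4)-(C8)."""
    p = np.asarray(p, dtype=float)
    M = np.asarray(M, dtype=float)
    var_p = p.var()                                   # Sigma_1(p)^2, 1/N normalisation
    a = np.mean((p - p.mean()) * (M - M.mean())) / var_p
    sigma_p = np.sqrt(var_p)
    sigma_zeta = np.sqrt(np.var(a * p - M))           # scatter about the fitted line
    b = M.mean() + beta * sigma_zeta ** 2 - a * p.mean()   # Malmquist-type shift
    p0 = p.mean() + beta * sigma_p ** 2 * a
    return a, b, p0, sigma_p, sigma_zeta


def ml_residuals(p, M, a, b, p0, sigma_p, sigma_zeta, beta=BETA):
    """Left-hand minus right-hand sides of Eqs. (C4)-(C8); all should vanish."""
    r = a * p + b - M
    return np.array([
        np.mean(p * r) - beta * sigma_zeta ** 2 * (p0 - beta * sigma_p ** 2 * a),
        a * p.mean() + b - (M.mean() + beta * sigma_zeta ** 2),
        p0 - (p.mean() + beta * sigma_p ** 2 * a),
        np.mean((p - p0) ** 2) - sigma_p ** 2 * (1 + beta ** 2 * sigma_p ** 2 * a ** 2),
        np.mean(r ** 2) - sigma_zeta ** 2 * (1 + beta ** 2 * sigma_zeta ** 2),
    ])
```

The slope is the ordinary regression slope of $M$ on $p$ and $\sigma_p$ is the sample dispersion of $p$; only $b$ and $p_0$ carry the $\beta$-dependent shifts.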



C.2. Determination of $H_0$

According to Eq. (IsUtI), the pd reads in terms of the observables
$x$, $y$ and $\eta$, see Eq. (0,pol,po|), as follows

$$ dP^{\mathrm{DTF}} = \frac{\phi_m(x+\eta)}{P_{\rm th}(\phi_m)}\, f_p\!\left(\frac{x+y-b}{a}\right)\kappa(\eta-\mathcal{H})\,\frac{dx\,d\eta}{a} \times g_G(y;\mathcal{H},\sigma_\zeta)\,dy. \tag{C9} $$

Equations (B2, C1, C2) show that the normalization factor
$P_{\rm th}(\phi_m)$ does not depend on $H_0$. Thus, according to Eq.
(1,0,011), the lf reads



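$$ \mathcal{L}^{\mathrm{DTF}}_{\rm det}(H_0) = -\frac{1}{N_2}\sum_{k=1}^{N_2}\frac{(y_k-\mathcal{H})^2}{2\sigma_\zeta^2} - \beta\mathcal{H}. \tag{C10} $$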



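Hence, the likelihood equation ($\partial\mathcal{L}^{\mathrm{DTF}}_{\rm det}/\partial H_0 = 0$) provides
us with Eq. (|43). Obvious calculations provide us with the
pdf describing the distribution of the random variable $y$,

$$ dP^{\mathrm{DTF}} = g_G(y;\,\mathcal{H}+\beta\sigma_\zeta^2,\,\sigma_\zeta)\,dy, \tag{C11} $$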



which shows that the standard deviation is given by 



Eq. 



[Footnote: One integrates the pdf given in Eq. (C9) over $\eta$, and then
over $x$, using (Def. 2.a,b,c), which gives Eq. (B2, C1, C2).]
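The shift $\beta\sigma_\zeta^2$ appearing in Eq. (C11) can be checked with a small simulation. The sketch below assumes a Gaussian $f_p$, $\kappa(\mu) \propto \exp(\beta\mu)$ with $\beta = 3\ln 10/5$, a sharp cut-off in apparent magnitude and $\mu = \eta - \mathcal{H}$; the estimator $\langle y\rangle - \beta\sigma_\zeta^2$ is the one implied by Eq. (C11) under these assumptions, and all names and numerical values are hypothetical.

```python
import numpy as np

BETA = 3.0 * np.log(10.0) / 5.0  # assumption: kappa(mu) ~ exp(BETA * mu)


def draw_dtf_y(n_draw, H, a, b, p0, sigma_p, sigma_zeta, m_lim, seed=0):
    """Return y for a magnitude-limited sample drawn under a DTF-type model."""
    rng = np.random.default_rng(seed)
    p = rng.normal(p0, sigma_p, n_draw)          # line-width parameter
    zeta = rng.normal(0.0, sigma_zeta, n_draw)   # intrinsic scatter, independent of p
    M = a * p + b - zeta                         # DTF relation
    M0 = a * p0 + b                              # Eq. (C1)
    sigma_M = np.hypot(a * sigma_p, sigma_zeta)  # Eq. (C2)
    # Distance moduli with density ~ exp(BETA*mu) on [mu_hi - 12, mu_hi] (inverse CDF).
    mu_hi = m_lim - M0 + 6.0 * sigma_M
    floor = np.exp(-12.0 * BETA)
    mu = mu_hi + np.log(floor + rng.random(n_draw) * (1.0 - floor)) / BETA
    m = M + mu
    keep = m < m_lim                             # sharp magnitude cut-off
    eta = mu + H                                 # assumption: mu = eta - H
    y = a * p + b - m + eta                      # equals H + zeta
    return y[keep]


def dtf_estimate(y, sigma_zeta, beta=BETA):
    """Bias-corrected mean of y and its standard deviation."""
    return y.mean() - beta * sigma_zeta ** 2, sigma_zeta / np.sqrt(y.size)


H_TRUE = 5.0 * np.log10(75.0)  # illustrative value only
y = draw_dtf_y(1_000_000, H=H_TRUE, a=-5.0, b=-9.0, p0=2.2,
               sigma_p=0.1, sigma_zeta=0.4, m_lim=14.0)
H_hat, H_err = dtf_estimate(y, sigma_zeta=0.4)
print(f"N = {y.size}, naive mean = {y.mean():.3f}, corrected = {H_hat:.3f} +/- {H_err:.3f}")
```

In contrast with the ITF case, $\zeta$ is here correlated with $M$, so the magnitude cut-off favours objects with positive $\zeta$ and the uncorrected mean of $y$ overestimates $\mathcal{H}$.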






D. Differences on the data description 

In this section, we show that the DTF model and the ITF
model describe the data distribution in different ways. This
statement can easily be proved by assuming the contrary,
namely that the model parameters $a$ and $b$ are identically
defined in both models (and thus also the random variable $\zeta$).
Indeed, if $a$ and $b$ are the same in both models, the luminosity
distribution function $f_M$ can be calculated according to
Eq. (|) but within the DTF model, as given by Eq. (|^. Now,
by writing the pdf $f_p$ according to Eq. (h), thus within the ITF
model, see Eq. (|28[), we obtain two integrals that we transpose
to obtain the following compatibility condition

$$ f_M(M) = \int f_M(t)\,g_G(t;M,\sqrt{2}\sigma_\zeta)\,dt \approx f_M(M) + \sigma_\zeta^2\,\partial_M^2 f_M(M), \tag{D1} $$

which cannot be satisfied exactly, although the disagreement is not
drastic if the luminosity distribution function varies
weakly within ranges of the order of $\sqrt{2}\sigma_\zeta$.
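The size of the mismatch in the compatibility condition (D1) can be gauged numerically. The sketch below takes a Gaussian luminosity function as a purely illustrative choice (the argument above does not require it), for which the convolution with $g_G(\cdot;M,\sqrt{2}\sigma_\zeta)$ is known exactly; the function names and numerical values are hypothetical.

```python
import numpy as np


def gauss(t, mean, sigma):
    """Normalised Gaussian density."""
    return np.exp(-0.5 * ((t - mean) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def compatibility_check(M0=-20.0, sigma_M=1.0, sigma_zeta=0.1):
    """Return (max |conv - f_M|, max |conv - second-order expansion|) on a grid."""
    M = np.linspace(M0 - 5.0 * sigma_M, M0 + 5.0 * sigma_M, 2001)
    f = gauss(M, M0, sigma_M)
    # Convolution of two Gaussians: variances add (sigma_M^2 + 2*sigma_zeta^2).
    conv = gauss(M, M0, np.sqrt(sigma_M ** 2 + 2.0 * sigma_zeta ** 2))
    # Analytic second derivative of the Gaussian f_M.
    f2 = f * (((M - M0) / sigma_M ** 2) ** 2 - 1.0 / sigma_M ** 2)
    approx = f + sigma_zeta ** 2 * f2
    return np.max(np.abs(conv - f)), np.max(np.abs(conv - approx))


mismatch, expansion_error = compatibility_check()
print(f"max |conv - f_M| = {mismatch:.2e}, max |conv - expansion| = {expansion_error:.2e}")
```

The first number measures how far the condition is from being satisfied; it shrinks as $\sigma_\zeta/\sigma_M$ decreases, in line with the remark that the disagreement is mild when $f_M$ varies weakly over scales of order $\sqrt{2}\sigma_\zeta$.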

E. Biases due to measurement errors 

In order to calculate the magnitude of biases related to mea- 
surement errors, we have to calculate the normalization factor 

^th ( Pt (4>m) I), see Eq. (pq). It turns out that one needs to 

specify the function $\kappa(\mu)$, and thus we assume a uniform spatial
distribution of sources, i.e., (h2), see Eq. ([ll[). Hence, it
is clear that the integrations over the $\epsilon_\lambda$ give unity, except
for the one over $\epsilon_m$, because of selection effects, see Eq. (|69|).
The calculation becomes evident if we use the dummy variable
$\tilde\mu = \mu + \epsilon_m$, so that $\kappa(\mu) = \kappa(\tilde\mu)\exp(-\beta\epsilon_m)$, and then by
using (Def. 2.b,c) we obtain Eq. ([72[). Hence, Eq.(p8|) transforms
as follows



rf^bl = 



dPti 



X exp 



{-> 



(s)\ 



dp: 



(s) 



(El) 



where it becomes clear that the selection function $\phi_m$ plays
the role of a correlation function between the variables $m$ and
$\epsilon_m$. If the measurement errors were known, then one could
restore the values of the observables from Eq. (nSl-tSS), and then use
the ML technique for obtaining genuine statistics. In such a
case, according to Eq. (Rll), and because one has necessarily
$\phi_m(\tilde m_k) = 1$ for every individual datum ($k = 1, N$), one
understands that one still obtains statistics identical to the ones
given by Eq. (p2y34,k3), where the errors are ignored. However,
since (in practice) the measurement errors are not known, the
$\epsilon_\lambda$-dependent parts of these statistics are substituted by their
expected value according to the p.d. given in Eq. (E1). Let us
proceed with preliminary calculations. It is easy to show that



d{s) 



(^a) 




if tx = £m 

otherwise 



and 



d(^) 



ex 



P^.^Aex. 



(- 



^I') 



(E2) 



(E3) 



[Footnote: The approximation is obtained by expanding the right-hand
term, see also (Def. 2.d).]



According to Eq. (p|, p3|J63| ), for Step 1, let us define the
following variables

$$ \tilde M = \tilde m - \tilde\mu, \tag{E4} $$
$$ \epsilon_M = \epsilon_m - \epsilon_\mu, \tag{E5} $$

so that the absolute magnitude reads

$$ M = \tilde M - \epsilon_M. \tag{E6} $$



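Note that, because $\epsilon_m$ and $\epsilon_\mu$ are independent random
variables, we have

$$ \left(\sigma_M^{(\epsilon)}\right)^2 = \left(\sigma_m^{(\epsilon)}\right)^2 + \left(\sigma_\mu^{(\epsilon)}\right)^2. \tag{E7} $$

Therefore, according to Eq. (E2, E4-E7), we have

$$ \langle M\rangle_1 = \langle\tilde M\rangle_1 + \beta\left(\sigma_m^{(\epsilon)}\right)^2, \tag{E8} $$

and since $M$ and $\epsilon_M$ are independent, it follows

$$ \Sigma_1^2(M) = (1-\delta_M)\,\Sigma_1^2(\tilde M), \tag{E9} $$

see Eq. ([70|). Similarly, it is easy to show that

$$ \langle p\rangle_1 = \langle\tilde p\rangle_1, \tag{E10} $$
$$ \left(\Sigma_1(p)\right)^2 = (1-\delta_p)\left(\Sigma_1(\tilde p)\right)^2, \tag{E11} $$

see Eq. WW, and thus that

$$ \rho^2(\tilde p,\tilde M) = (1-\delta_p)(1-\delta_M)\,\rho^2(p,M). \tag{E12} $$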
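The attenuation of the dispersions and of the correlation coefficient by measurement errors can be checked with synthetic data. The sketch below deliberately ignores selection effects (it sets the $\beta$-dependent terms aside) and assumes that $\delta_M$ and $\delta_p$ denote the ratio of the error variance to the observed variance, which is an assumption since their definition lies outside this appendix; the function name and numerical values are hypothetical.

```python
import numpy as np


def attenuation_check(n=200_000, a=-6.0, b=-7.0, sigma_zeta=0.4,
                      err_p=0.05, err_M=0.3, seed=0):
    """Compare true and error-affected second moments of (p, M)."""
    rng = np.random.default_rng(seed)
    p = rng.normal(2.2, 0.15, n)                       # true line-width parameter
    M = a * p + b - rng.normal(0.0, sigma_zeta, n)     # true absolute magnitude
    p_obs = p + rng.normal(0.0, err_p, n)              # observed values with
    M_obs = M + rng.normal(0.0, err_M, n)              # independent Gaussian errors
    delta_p = err_p ** 2 / p_obs.var()                 # assumed definition of delta_p
    delta_M = err_M ** 2 / M_obs.var()                 # assumed definition of delta_M
    return {
        "var_M": M.var(),
        "var_M_corrected": (1.0 - delta_M) * M_obs.var(),
        "var_p": p.var(),
        "var_p_corrected": (1.0 - delta_p) * p_obs.var(),
        "rho2_obs": np.corrcoef(p_obs, M_obs)[0, 1] ** 2,
        "rho2_predicted": (1.0 - delta_p) * (1.0 - delta_M) * np.corrcoef(p, M)[0, 1] ** 2,
    }


result = attenuation_check()
for name, value in result.items():
    print(f"{name:>16s} = {value:.4f}")
```

Errors leave the covariance of $p$ and $M$ unchanged on average but inflate both variances, which is why the observed squared correlation is reduced by the product $(1-\delta_p)(1-\delta_M)$.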

For convenience in writing we use the following variable 

5. 



+ S^ 



(E13) 



see E' 



Eq. (I29H2 

ITF 



E8 



|70|). In the case of the ITF model, according to 



a (1 - dM) , 



,ITF 



13|) one has 

(E14) 
6-' = 6-' - Sm&'^^(J>}i + P {ai'2f , (E15) 

(ar)'-(l-5M)(^r)' 

- ea, (1 - 5m) (1 - 5p) (a'"^^)' (Si(p))^ (E16) 

Hence, according to Eq. (|20|), we easily obtain Eq. ([73|). Simi- 



larly, for the DTF model, we obtain 

-DTF 
DTF ffl 



il-Sr)' 

e.^[a^^^{p),^p{ar^Y + f3{E,{M)y 



(E17) 



+ H-ill)" 



(Si(M)) = 



/ DTF\^ ^ /~DTF\^ 

Hence, according to Eq.(pd), we easily obtain Eq.([73|). 



(E18) 
(E19) 